Episode 37: Data Privacy in the Age of AI
[00:00:00] Dr Genevieve Hayes: Hello and welcome to Value Driven Data Science, brought to you by Genevieve Hayes Consulting. I'm Dr. Genevieve Hayes and today I'm joined by Dr. Katharine Kemp to discuss data privacy challenges in the age of AI. Katharine is an Associate Professor in UNSW's Faculty of Law and Justice and Deputy Director of the Allens Hub for Technology, Law and Innovation.
[00:00:27] Her research focuses on competition, data privacy, and consumer protection regulation, including their application to digital platforms. Katharine, welcome to the show.
[00:00:39] Dr Katharine Kemp: Thanks. Good to be here.
[00:00:40] Dr Genevieve Hayes: Most people have come to accept that the price of living in a technological world is some loss of data privacy. However, few realize just how much privacy they're giving up.
[00:00:53] Now, to be perfectly honest, even though I've been working with data my entire adult life, in preparing for this episode, I was shocked to learn just how much my data privacy as a consumer has been eroded in recent years. And much of what I learned, I owe to your excellent research, Katharine. To give our listeners a taste of what's to come in this episode, and possibly to scare them out of their minds, Katharine, could you give us a brief overview of some of the areas you've explored through your research into data privacy?
[00:01:35] Dr Katharine Kemp: Yes, and thank you so much. And I'm sorry that you, as other people have mentioned to me, have had that disturbing feeling upon reading this research. But I do think it is important information for us to begin to see the breadth and depth of data that's collected on us across our daily lives.
[00:01:58] Because it's obviously becoming so much more invisible, especially to the average person, when data is being collected, let alone who else will see that data and the purposes for which it will be used. So I have, for example, in the consumer context, taken a look at areas as diverse as motor vehicles, picking up all kinds of data from both the sensors in the car itself and the phone's apps, which of course can range from all of the various contacts, entertainment settings and information on the person's mobile device, as well as sensors in the car, potentially cameras, microphones, tracking of driver behavior in terms of braking and acceleration, and obviously location data, all brought together in a real-time way
[00:03:02] to form that highly detailed picture of what it is the person is doing on a daily basis, who they're associating with, where they're going, bearing in mind that when it comes to location data, just where it is that we travel to can reveal highly sensitive aspects of our life in terms of our religious beliefs,
[00:03:24] our medical needs, our family and its intimate contacts.
[00:03:30] In a very different area, I've also looked at fertility apps, which many people use sometimes just to track their periods, but also in their attempts to get pregnant or to manage their symptoms of menopause over time.
[00:03:45] And it was very concerning to see not just the range of data that was collected, which went well beyond what was necessary to provide that service to consumers, but also the way that notices and settings were presented to consumers so that they wouldn't be aware of the full extent to which that information is used and disclosed to others and retained for some really surprising periods of time.
[00:04:17] One of these apps, for example, would keep users' data for seven years after they stopped using the app.
[00:04:25] Dr Genevieve Hayes: Oh, dear.
[00:04:26] Dr Katharine Kemp: It leaves an awfully long time for something to go wrong, as you can imagine.
[00:04:31] Dr Genevieve Hayes: Oh, yeah. Yeah, I can imagine that the person who came up with whatever agreement it was that people agreed to was someone thinking the way a lot of data scientists think: how can I get as much data as I can in order to do all this analysis? And they're not thinking about potential data breaches, like the Optus data breach, which can happen if they're not careful.
[00:04:56] Dr Katharine Kemp: It is really interesting to see how diametrically opposed the perspectives can be between privacy lawyers and advocates on the one hand, and data scientists and data analysts on the other.
[00:05:08] Yeah. Obviously, on the privacy side, one of the number one principles of privacy by design is data minimization, so that you're only asking for what it is that you need to provide that particular service, to fill that particular function, because showing that restraint means that you will have less to worry about later.
[00:05:32] There's less data that you need to keep safe. There's less that you need to keep an eye on as you retain it to make sure that you're not retaining it for longer than you actually should. And there's less to track down and delete when you no longer need it. Data minimization certainly isn't the theme of a lot of the conversations and presentations I've heard from data scientists and data analysts.
[00:06:02] And I shouldn't say that across the board; it's not uniformly so. There are some people who are very conscious of the risk. But for some it can be very much a matter of let's collect as much as possible, from as broad a range of sources as possible, for as long as possible, so that we can find out what we might be able to do with that data and the exciting opportunities to make inferences, discern patterns and achieve, I'm sure in some cases, some beneficial results, but not always being aware of the risk that goes with that.
[00:06:40] Dr Genevieve Hayes: As a data scientist, my natural instinct is that whole Pokémon "gotta catch 'em all" attitude towards data. But as a consumer, I don't want Optus having my data for any longer than it needs to, or any of these things that lead to data breaches. And I have to remind myself that both of these attitudes are incompatible with each other.
[00:07:03] Dr Katharine Kemp: Yeah, and I think it is something that we don't tend to think about enough, at least in Australia, from an early age. We don't have those conversations about, for example, what is the value of privacy? Why do we care about privacy? Why do we need it? You don't have those in primary school or even in high school, and it means that I've sometimes been having conversations, many conversations, with economists who still don't, as adults, understand why it has any significance in our lives, save that it is seen as something that some people have some personal preferences about
[00:07:52] that can't really be explained, but you provide them with those choices that, you know, they can manage, and otherwise it's really just some people being precious, without understanding the bigger picture of where privacy fits in in our society as a whole, well beyond any personal preferences.
[00:08:16] Dr Genevieve Hayes: I'm interested in what you just said there about economists not being able to understand this. I mean, it doesn't surprise me. Can you give an example of what sort of data they think people are being overly precious about?
[00:08:30] Dr Katharine Kemp: Yeah, so I can give you some examples of comments I've heard in conferences and so forth, because, being a competition lawyer, I'm often at conferences where there'll be both lawyers and economists. And I shouldn't lump this all on economists, because frankly, some of the lawyers feel the same way.
[00:08:46] So I don't want to be drawing that line too firmly. But, for example, in the context of a discussion about digital platforms not too long ago, I made the point on a panel that privacy is not just the same as other qualities of products and services that we deal with, where we're interested in consumer preferences, where it might come down to questions of color or questions of speed and various other attributes that you're considering.
[00:09:19] I made the point that privacy, in contrast, is actually a human right. And so we're not only concerned with what a particular consumer might want, or the idea that we can somehow let them sell some of their privacy for the sake of convenience, but that we need to consider this as a fundamental right, and one which is necessary to our development as humans and critical for our autonomy and dignity as humans.
[00:09:53] And this inspired a very quick reaction from a couple of professors of economics, who said this is what's wrong with discussions like this, starting to talk about privacy as a right distorts its significance, and they seemed quite upset by that. But also some of the conversation then moved to,
[00:10:14] I don't care who knows I bought a casserole dish on Amazon last week, if they want to know that, fine. And I've had others say, you know, I don't care who finds out that I've got glaucoma, or any number of things. So it's both focusing on their own experience of life and a fairly limited understanding of what information is actually being collected, but also, as you probably have heard
[00:10:47] people say many times, that sense of I personally have nothing to hide, so how can this possibly matter? It's all preference.
[00:10:59] Dr Genevieve Hayes: It's like when you go to buy underwear. Everyone goes to buy underwear, but you don't want to put it on Facebook and have the world know what brand and color you actually went and bought.
[00:11:10] Dr Katharine Kemp: Yes, I think you're right. And I think that at some level everybody realizes that they do have some boundaries, that there's somewhere that they wouldn't want the cameras to follow them, and they wouldn't want every last bit of information to be known about them; that, in fact, we do all have something to hide. And aside from that, for us in our personal experience, very often the people who are saying,
[00:11:41] I've got nothing to hide, if they really analyzed it, they'd see that perhaps what they mean is that they had the good fortune to be born in the so-called right neighborhood, come from the right ethnic background, and have an education that looked great, a credit history that looked fabulous, seemingly good health,
[00:12:08] and all kinds of predictors of risk that looked exactly right to somebody analyzing their lives. Now, for starters, they might be wrong about all of that, but even assuming they're right, it's ignoring the fact that for many people, that detailed picture of their life can lead to prejudice, exclusion, discrimination, exploitation of vulnerabilities. And we don't only need to think
[00:12:40] of those who are looking good when their lives are exposed, but to look across the whole spectrum and to be most concerned with those who are most vulnerable when their entire lives are exposed.
[00:12:55] Dr Genevieve Hayes: Did you ever read the books The Circle and The Every by Dave Eggers?
[00:13:01] Dr Katharine Kemp: No,
[00:13:03] Dr Genevieve Hayes: I can't remember which one this happened in, because one's the sequel to the other. The Circle is a company that's basically Facebook, and The Every is, imagine a merger between Facebook and Amazon,
[00:13:17] Dr Katharine Kemp: Gotcha.
[00:13:19] Dr Genevieve Hayes: and basically what they deal with is, yeah,
[00:13:21] imagine a world where there is zero privacy on the internet and you've got constant monitoring of people. And initially people are doing it voluntarily, you know, like how people put their whole lives on social media now. But then it becomes, if you're not showing every detail of your life on social media, what have you got to hide?
[00:13:47] And it ends up with everyone constantly censoring themselves and basically being unable to function in this world.
[00:13:58] Dr Katharine Kemp: Yeah, and you do see that we're moving towards that well and truly already. If you consider, for example, that at present we've got insurance companies that are expecting people to have devices in their cars that track their driver behavior through telematics. And if they don't want that device in their car, there's an assumption that they don't want their behavior to be known.
[00:14:25] It must be bad behavior if you don't want us to know about it. But that's playing out in other areas as well. In social media, for example, I have friends who come from countries where the history that they have with their home country means that they don't want to have an online presence at all, but that has an impact on their social life when they attempt to start dating people.
[00:14:49] And it's seen as suspicious if you don't have Facebook, Instagram, Twitter, LinkedIn, anything. Then, again, what are you trying to hide? It must be a bad thing. And that's very much not just a social reaction to these aspects, but we have seen economists, Richard Posner, for example, who've written on the economics
[00:15:15] of privacy and claimed that privacy is actually very inefficient, because it allows people to hide information about themselves that other people need to know. And he sees it as essentially deceptive to be withholding some of that information, rather than seeing it from the view of some philosophers and sociologists who would say, of course,
[00:15:42] all of us have multiple versions of ourselves that are not suitable for consumption by every single person who we deal with. It would be terrible for me to be standing up in front of my class, who I lecture at the university, and telling them every detail of the history of my life. Entirely inappropriate, distracting, and there's no way of it helping either them or me, in any context.
[00:16:13] So it is a very natural part of being human that we present different versions of ourselves in different contexts. And it is seen as a good thing from the perspective of some of the social sciences, but certainly there are economists and others who would think it's much better if we can see the whole lot, the whole time.
[00:16:37] Dr Genevieve Hayes: I think everyone has the right to change their life and to move forward and not to have to be saddled by the baggage of their past if it's not relevant.
[00:16:51] Dr Katharine Kemp: Yeah, and I think that's a really critical aspect. There is that relevance, and in some cases considering whether entirely unnecessary information is being used to exclude somebody from certain offers, or to target them with offers that they're regarded as susceptible to. I mean, we've seen this go to the extent of a marketing article that I read that was explaining to marketers how to work out when women feel least attractive, like the day of the week, the day of the month, the time of day when they feel least attractive and so would be most receptive to hearing about your cosmetics or cosmetic procedures, and that you can actually start to sort this out from observing their online behavior.
[00:17:46] And I'm sure people could work that out about a lot of us, but whether that's a good thing is an entirely different question.
[00:17:56] Dr Genevieve Hayes: Yeah, it's frightening. But as we were saying before, when a consumer signs up for a new app or digital service, you normally have to fill out that consent form or tick the consent box for the terms and conditions, which no one ever reads. And you have to agree to them using your personal data.
[00:18:17] And they usually promise you that they won't share your data or will only use it in an anonymized form. Now, I have a very clear idea of what I consider to be my personal data and the sorts of data I expect service providers to be collecting about me when I give my consent, but from reading your papers in preparing for this interview, I think my views were rather naive.
[00:18:48] What are some of the more surprising data points companies are now collecting about their customers?
[00:18:54] Dr Katharine Kemp: I think one of the biggest things that consumers wouldn't tend to be aware of is how much data matching and so-called data enrichment happens behind the scenes, in that most consumers would be thinking they've got a fair idea of, firstly, the information they handed over themselves: name, email address, certain preferences for using an app, and their actual use of an app or websites, the purchases they've made.
[00:19:24] They've got an idea that that ended up with the other company. Some might be vaguely aware that their usage data is being collected as well, to see how they behave from page to page and what they focus on, and that that can be collected over time to reveal patterns and preferences. Far less visible is the data that's collected from other
[00:19:49] companies, sometimes from data broking services in the form of data enrichment, or from data matching with other businesses, where you are either providing an email address, or very often now providing a hashed email address, between different companies, which may be two or many companies, to bring together
[00:20:15] all of the data that each of them have on the one individual who uses that original email address, to form that far more detailed 360-degree view of the consumer. Now, I think a lot of the businesses that are engaging in that data matching and data enrichment, which ends up with a person just having handed over an email address and perhaps their name, and then it's filled out with their age range and their income range and their family situation and their purchase intentions,
[00:20:55] and even potentially down to their health situation, a lot of those companies are thinking this is okay, because somewhere in the fine print of our privacy policies we had some vague term that said we could use this data for data analytics purposes or research or for marketing, and so there's consent there and somebody's got some consent.
[00:21:21] So we'll be all right. And really we're just forming this more detailed view so we can understand the consumer better. But that's certainly a practice that we know consumers are deeply uncomfortable with. We have plenty of consumer surveys, conducted by the OAIC, that's our privacy regulator, the ACCC and the Consumer Policy Research Center, that show that consumers
[00:21:53] don't want businesses sharing their information with other businesses that they have no relationship with, or connecting and combining that information. A bigger issue for the businesses as well is that it goes against an existing law in Australia, which requires that personal information should be collected directly from the individual themselves unless it's unreasonable or impracticable, and impracticable means practically impossible, to collect it from the individual.
[00:22:30] Dr Genevieve Hayes: So basically ask the person directly and don't try and go around behind their back. Yes, but I'm...
[00:22:36] Dr Katharine Kemp: It seems a simple rule. And the puzzling thing is that, on the whole, it hasn't been enforced against companies doing data matching and doing data enrichment. And my guess is that we have a privacy regulator that we know to be under-resourced, and the idea of taking on these practices, in some cases on the part of some very large data broking services, would be a daunting enforcement task. But it doesn't make the practices legal.
[00:23:16] Dr Genevieve Hayes: You mentioned data enrichment. What exactly do you mean by data enrichment?
[00:23:21] Dr Katharine Kemp: This is the obviously euphemistic term that's used by a number of data providers, some of the data brokers that span many countries. And it's offered to consumer-facing businesses, mainly retailers, to say, essentially, if you have this customer database full of email addresses of your customers, you can come to us with
[00:23:50] the email address of each of those individuals and we can fill in the gaps on the information you don't know about them. So you might only know their address and the purchases that they've made from your company, but we can tell you their age or age range, their income range, their family situation, whether they're married or unmarried, whether they've got kids in primary school or high school, or babies, whether they're planning to buy a home, whether they have a mortgage intention, and so forth.
[00:24:25] So all kinds of intentions and personal attributes that you haven't been able to collect from the person themselves, and very likely, if you asked them yourself, they'd tell you to get lost, because they have no intention of sitting down and telling you all those personal details. But the service is instead offered by a third party.
[00:24:50] Dr Genevieve Hayes: How do businesses find these third parties? Because, I mean, okay, I've never actually Googled data broker. Would you just Google data broker in order to find one?
[00:25:00] Dr Katharine Kemp: Yeah, there's some very large companies, and you can start Googling. I don't want to go telling everyone how to get their data enrichment services, but if you Google data enrichment, sometimes it will have slightly different names, but you'll see major companies providing this service. And it will sometimes be presented in that way, as a source of information that's only going in one direction, but very often it is instead a data matching arrangement between various businesses, bringing both of their customer databases together and performing that data matching to see, if we either connect the email addresses up themselves, or very often put those email addresses through the same hash function, so that each unique email address spits out a unique alphanumeric string, then how many of them match up?
[00:26:01] And so what information can we put together on our shared customers?
[00:26:08] Dr Genevieve Hayes: I get it. So this is how you'd de-anonymize anonymous data. In the matching, I'd have the customer list with all the true email addresses, I'd hash it using this agreed-upon hashing algorithm, then you, in your company say, would do exactly the same thing. We'd match them using the hashes.
[00:26:32] And then, because I know the email address with this particular hash corresponds to whatever, I can get those additional details.
[00:26:42] Dr Katharine Kemp: That's exactly it.
[00:26:44] Dr Genevieve Hayes: Ah, so that's sort of getting around these "yes, we anonymize our data" promises.
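For data scientists following along, here is a minimal sketch of the kind of hashed-email matching being described, assuming both parties normalize the address and apply the same unsalted SHA-256 hash; the records, field names and company roles are purely illustrative and not drawn from any real broker.

```python
import hashlib

def hash_email(email: str) -> str:
    """Normalize an email address and return its SHA-256 hex digest."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()

# Retailer's customer list: raw emails plus whatever it collected itself.
retailer = {
    "jane.doe@example.com": {"purchases": ["casserole dish"]},
    "sam@example.org": {"purchases": ["running shoes"]},
}

# "Enrichment" partner's list, keyed by the same hash rather than the raw address.
partner = {
    hash_email("jane.doe@example.com"): {"age_range": "35-44", "income": "high"},
}

# Matching on the shared hash joins the two profiles without exchanging raw emails.
for email, profile in retailer.items():
    extra = partner.get(hash_email(email))
    if extra:
        profile.update(extra)

print(retailer["jane.doe@example.com"])
# {'purchases': ['casserole dish'], 'age_range': '35-44', 'income': 'high'}
```

As the conversation goes on to note, the hash here only pseudonymizes the join key; it does not make the combined profile anonymous.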
[00:26:51] Dr Katharine Kemp: And so what you see then, in privacy terms, with companies who are doing this, is the kind of terminology that you just know consumers are not going to understand, saying, for example, we may exchange pseudonymized data with our trusted data partners to understand your needs.
[00:27:14] You know, the average person reading pseudonymized... I remember saying it to my mom for the first time and she said, that sounds very rude. It's not, but interestingly, it's something that we looked into recently in a consumer survey, to see what it is that consumers understand about all of this terminology that's used here,
[00:27:39] both by retailers and by data broking services, to describe what they're doing with consumers' data. So the Consumer Policy Research Center conducted this consumer survey, which I helped them to design, and we wrote a report together on the findings. And we found that most Australian consumers don't understand really important terms that
[00:28:07] are used to describe these data practices, and that includes terms like anonymized, pseudonymized, hashed email addresses, aggregate data, audience data. And the interesting additional finding that went with that was that the less recognition consumers had of those terms, the less likely they thought that data could be used to track and monitor and profile them.
[00:28:41] It seems like that lack of recognition makes it sound like it must be innocuous, or at least they're uncertain enough about the meaning that they don't think it can be used to track them in the same way as, say, the map of their location data or their search history. So what it seems like is that companies are quite likely deliberately using terminology that's not familiar to consumers, in the fine print of privacy policies, to justify these kinds of data matching and data enrichment practices and create what I really think is fictional consent on the part of consumers for those practices.
[00:29:31] Dr Genevieve Hayes: If someone got hold of this anonymized data or pseudonymized data, and they didn't know the hashing algorithm that was used, and didn't know that one-to-one mapping between the email addresses and the hashed values, would they be able to de-anonymize the data?
[00:29:52] Dr Katharine Kemp: This is another important point about this report, where we are trying to clarify some terminology. The report is called Singled Out, by the way, and it goes into this consumer understanding and misunderstanding of data broking services and privacy terms. But the other thing that we're doing is pointing out that a lot of those terms that you've just mentioned don't have any fixed meaning.
[00:30:19] They don't have a meaning under Australian law, or even a fixed meaning in custom among businesses in Australia. We've been mentioning anonymized data a couple of times; that's not a term that has any legal meaning in Australia. There's such a thing as anonymous data under the EU GDPR and the UK GDPR, but not in Australia.
[00:30:43] We have de-identified information under the Privacy Act, and that has a meaning, in terms of making sure that the information does not identify an individual or make an individual reasonably identifiable. But very often firms are not using the terminology of the Privacy Act, but coming up with all kinds of different descriptions of data that don't make it clear whether they're claiming this is personal information or de-identified information.
[00:31:19] And those two classifications are the only ones that matter under the Privacy Act, because the Privacy Act applies to personal information, and it doesn't apply once you've made that information de-identified information. So that's just a clarification on terminology, but to go back to your question, if you had
[00:31:40] a customer database where you had pseudonymized the email addresses, say by hashing them, then it is possible that another entity receiving that database wouldn't be able to identify the individuals, but it's also possible that they could. It depends what other information is in there with the hashed email address.
[00:32:04] And we will see over time, especially with advances in machine learning, that a collection of information with no name or identifier may still allow an individual to become identifiable, even if we haven't used those traditional identifiers. At the same time, we've also got to bear in mind that that list, in that pseudonymized state, is only one step away from becoming personal information
[00:32:39] if somebody does get hold of the hash function and is able to find out which of those email addresses becomes which alphanumeric string on using that particular hash algorithm. So both of those factors would need to be borne in mind in knowing that it is not necessarily de-identified information.
[00:33:02] Dr Genevieve Hayes: And I could imagine also that if all these companies are using the same software, you could probably make a pretty good guess at what hashing function they're using, because it's the built-in one, and then all you'd have to know is the random seed.
[00:33:18] Dr Katharine Kemp: Sometimes, yes, I think that would be the case. I think sometimes it would be guessable. Other times they might claim that this is not something that can be guessed and that they've taken extra trouble over it. But even in those situations, you still have the possibility that the data is linked back again to the person.
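To make that point concrete: if a common, unsalted hash is used, anyone holding a list of candidate email addresses can recompute the hashes and re-link the records, with no secret key required. This toy sketch continues the hypothetical example above; real systems may add salts or keys, which is roughly what the "random seed" comment is getting at.

```python
import hashlib

def hash_email(email: str) -> str:
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()

# A "pseudonymized" record shared or leaked without the raw address (illustrative only).
shared = {
    hash_email("jane.doe@example.com"): {"age_range": "35-44", "purchases": ["casserole dish"]},
}

# Anyone with a plain list of known or guessed addresses can hash each candidate
# and check for a match, re-linking the record to a named person.
candidates = ["sam@example.org", "jane.doe@example.com"]
for email in candidates:
    digest = hash_email(email)
    if digest in shared:
        print(email, "->", shared[digest])
```

Even where a keyed hash makes this check harder, the surrounding attributes can still single a person out, as the next part of the discussion shows.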
[00:33:41] And even if they never do that through the hashing itself, as I mentioned before, the collection of information with no name, no email address, can still allow you to single a person out. So we've seen great research from some academics who used to be at the University of Melbourne when they did this research; Vanessa Teague, for example, is now at ANU.
[00:34:08] They looked at how health data sets can be re-identified, and just as an example, pointed out that on the whole you might think, if you had the collection of birth mothers' birth dates and the birth dates of their biological children in a particular region, that you wouldn't necessarily be able to identify them without further details.
[00:34:36] But when you stop and think about it, if you start to have not just one child but a second child within a region, and it might even be a state of Australia, the likelihood of a biological mother being born on a certain date and then having both the first and the second child born on those dates is vanishingly small. But even if you've only got one child, if the biological mother is above a certain age or below a certain age, she may be the only one who shares
[00:35:13] that birth date and the birth date of her child. So that's just one example of the ways that seemingly de-identified data can actually single out the individual concerned.
[00:35:26] Dr Genevieve Hayes: When I was doing my master's, I was using a data set, and it was an anonymized student results data set, grades for a particular course. And it was a course that I'd actually taken, and there were about a thousand records in this data set from various years, but I was the only Australian female in the data set, so I could actually find my record in it.
[00:35:52] I didn't know anyone else. I didn't actually look to try and find anyone else, because I think that's morally wrong. But because I had such unique characteristics, it was possible for me to at least de-anonymise my own record.
[00:36:10] Dr Katharine Kemp: And that's quite right. And if you start to combine multiple data sets that are supposedly de-identified, you can easily see how that can result in those extra patterns emerging, excluding certain possibilities and whittling it down until you can tell who it is that you are dealing with and single out the individual.
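As a toy illustration of that singling-out effect: even with no names or email addresses in a data set, a couple of attributes can be enough to isolate one record, much like the student-results example above. All of the records below are made up.

```python
# A "de-identified" results data set: no names, no email addresses.
records = [
    {"nationality": "Australian", "gender": "F", "year": 2012, "grade": 84},
    {"nationality": "Chinese",    "gender": "M", "year": 2012, "grade": 71},
    {"nationality": "Indian",     "gender": "M", "year": 2013, "grade": 90},
    {"nationality": "Australian", "gender": "M", "year": 2013, "grade": 65},
]

# Filtering on just two quasi-identifiers leaves exactly one candidate, so anyone
# who knows those two facts about a person has effectively found their record.
matches = [r for r in records if r["nationality"] == "Australian" and r["gender"] == "F"]
print(len(matches), matches)
```

Linking a second, equally "de-identified" data set that shares some of these attributes narrows things down even further, which is exactly the combination effect described above.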
[00:36:38] Dr Genevieve Hayes: One of the things that a lot of people will do in order to try and get around this sort of sneaky behavior on the part of companies is use things like incognito mode on browsers like Google Chrome, or not sign into a particular website. If you're using, say, incognito mode, or just haven't signed into a website, would you really be anonymous?
[00:37:02] Or is that just a myth?
[00:37:05] Dr Katharine Kemp: Yeah, the steps that people can take to protect themselves are becoming more and more limited, bearing in mind especially the extent of cross-device tracking and cross-website and app tracking that occurs. And so you see, for example, the collection of data that reveals that, you know, a certain mobile phone and a certain laptop have been following the same location pattern and are therefore assumed to belong to the same person, so that even though they might not be logging in on one of those devices, they might have logged into some service on the other device.
[00:37:47] That allows information shared between companies to reveal who the individual was, even though they made the decision not to log in. One of the biggest examples I saw of this was a location data broker that's based in the States that also provides services in Australia. And it had patented this particular system and explained the rationale for its invention as being
[00:38:14] that Apple has introduced this App Tracking Transparency framework that allows consumers to signal that they don't want to be tracked across apps. And so we have this problem, in their view, that consumers will ask not to be tracked, and therefore we need a new way of tracking them. So the idea that is explained in this patent is that they found 27 other signals that they can pick up from this individual's life, through various different devices and various different ways of discerning somebody's location, to
[00:39:00] work out, in spite of their opting not to be tracked, how to track them nonetheless. So, I mean, that's where you just shake your head. But that is the attitude of a lot of companies, to say, you know, right, Google's deprecating third-party cookies, how do we replicate third-party cookies through some other means?
[00:39:29] It seems to be a recurring theme.
[00:39:32] Dr Genevieve Hayes: Yeah. And these people probably make a lot of money from this too. That's the scary part.
[00:39:38] Dr Katharine Kemp: It often reminds me of turning around to somebody on the path and saying, stop following me, and them replying, right, I'll start hiding behind the trees as I go, you don't mind that. The answer is really not getting through.
[00:39:57] Dr Genevieve Hayes: How do I get a restraining order against Google and Apple?
[00:40:01] Dr Katharine Kemp: Yeah, good luck.
[00:40:02] Dr Genevieve Hayes: Yeah. So, what are the privacy laws like in Australia? What are people legally allowed to collect? And is there any real consumer protection for people here?
[00:40:16] Dr Katharine Kemp: So the main law, of course, is the Privacy Act, and that applies to federal government agencies as well as businesses that make 3 million dollars or more a year, which does, for starters, cut out over 90 percent of our businesses, because we have this small business exemption, which is unusual.
[00:40:37] Other countries don't tend to have it. And there are some small businesses that are still covered by the Act; maybe they provide a health service, or they might actually be in a data business in the sense of trading in data, and they will be covered by the Act. Now, the Australian Privacy Principles, which set out the obligations for dealing with personal information,
[00:41:00] most of them come from the enactment of this legislation back in 1988. So you can well imagine that they have not kept pace, and there are some fundamental changes that are needed, as we've been acknowledging over the past couple of years as the Privacy Act review has been going on. One of the big things here is exactly what we've been talking about: what is personal information?
[00:41:29] That's critical, because that's what the Act applies to. And that's where a lot of companies at the moment, even if they seem to be providing a data broker service, can attempt to argue that they're not actually dealing in personal information, even though you can distinguish the person they're describing from all other people.
[00:41:57] So getting the definition of personal information right, and bringing it into line with the GDPR, for instance, is a massive foundational step in updating our privacy law.
[00:42:11] Dr Genevieve Hayes: Is Australia going to be in line with the GDPR anytime soon?
[00:42:15] Dr Katharine Kemp: No, no time soon. We can certainly get a little closer. There are various proposals for how we could come closer to the GDPR in that definition of personal information, but also in how we define consent. At the moment, our Privacy Act defines consent only as express or implied consent, and as you can imagine, a lot of businesses like that implied consent option and use it to argue that, well, we had a privacy policy on our website, and so we assume that if somebody is using the website or the app, they have read that entire privacy policy
[00:42:56] and understood it and worked out the consequences that would flow from these data practices, and so we've given notice and this is all okay. And just on that, you know, sometimes we start beating ourselves up as consumers and say we're all so lazy, we don't read these privacy policies. The research has been done: it would take people six working weeks a year to read all of the privacy terms that apply to them.
[00:43:21] So it's ridiculous to suggest that we're going to do that, even if we were able to understand them and make choices based on them, which we can't. It's much more a story of learned helplessness. So we need to change our definition of consent, but even more than that, we need to move away from a consent approach that puts it down onto individual consumers to say,
[00:43:46] work out how all of these data practices function nowadays, and the full map of where your personal information is moving between all of these companies and countries, and the consequences of that, and then try and make some decisions. It has to be that we have more substantive rules about what's fair and what's reasonable,
[00:44:09] and that we create boundaries in our law, rather than just saying, well, if the consumer consented to it, it's all okay.
[00:44:18] Dr Genevieve Hayes: Yeah. So it sounds like we don't really have any protection at all in Australia.
[00:44:23] Dr Katharine Kemp: It is very limited protection. And in addition to that, we have an under-resourced privacy regulator. So, you know, while we have pecuniary penalties available under our privacy legislation, the privacy regulator has brought one action seeking this penalty, against Meta, in the last year, over the
[00:44:46] past almost decade that that's been possible. And so it's not as if company directors are lying awake at night worrying about whether they're going to have a pecuniary penalty under the Privacy Act imposed on them.
[00:45:02] Dr Genevieve Hayes: And what impact has the widespread adoption of generative AI tools such as ChatGPT had on consumer data privacy?
[00:45:14] Dr Katharine Kemp: This obviously adds another complication, because everyone and their dog is busy plugging information into these generative AI apps in an attempt to get somebody else to do their homework. And so we immediately have, sometimes, people putting confidential information in there. We've seen lawyers and accountants get into trouble for putting client information in there.
[00:45:42] But aside from breaches of confidentiality, there are clearly personal information issues. That in part comes from the information that might be input by a person using the app, but also from the nature of the app itself in collecting up all of that data from the internet and using it for this new purpose. And one of the fundamental principles of data protection is the purpose limitation principle, and the need to establish that
[00:46:19] data is only used for the purpose for which it was originally collected, or a related purpose, unless you have the express consent of the consumer. And so that's going to be a very difficult thing to establish in many cases where that personal information may be used in the case of generative AI.
[00:46:42] Dr Genevieve Hayes: So if I put in personal data relating to a client, say, it's possible that data might be used to train the model at some future point in time, and that data might get spat out to some person who shouldn't have that data?
[00:47:00] Dr Katharine Kemp: Yes, you have those situations, but you also have the data that the AI was originally trained on. That is another question, of whether they had the right to collect and use that information in the first place. So those are the kinds of actions that we see being brought in the European Union, challenging the very business model and the technical aspects of how these apps were created.
[00:47:32] I think in some cases people feel that the genie's out of the bottle. Are they really going to undo what has been done in this case? But I think, not only in the case of AI but in technological advances generally, you can start to see a pattern of companies deciding to take the approach that we'd rather ask forgiveness than ask permission.
[00:48:05] And once you've gone far enough down a track, who's going to stop you?
[00:48:11] Dr Genevieve Hayes: So what steps can consumers take to protect their privacy when dealing with the online world?
[00:48:18] Dr Katharine Kemp: It's certainly not easy. They can, and I would hope many people would, vote with their feet in favor of companies that are providing privacy-enhancing tools, maybe making the choice in favor of Proton Mail and the like, or using end-to-end encrypted messaging apps,
[00:48:42] if that's what they need. They've got the possible choices with VPNs and certain modes that they can use with various browsers. And we know that there are browsers that offer less tracking than others. And it's very good, in my view, that you start to see some of those features being highlighted when you're
[00:49:07] looking for those services for the first time. But something that I think we really need to acknowledge is that you can make all of those choices, to some extent, in your personal life. Perhaps, say, choosing that email provider is one of the easier ones. But then if you decide for yourself, okay, when it comes to social media, I'm not dealing with Facebook or Instagram, and my kids are not going to deal with Snapchat and TikTok and so forth, you're going to have a very small audience,
[00:49:39] or friend group. I mean, I don't have Facebook and Instagram, and I do have a very small friend group, but you know what I mean? Joking aside, you can't just recreate those network effects somewhere else. And aside from those social aspects, you also have to consider how many times your employer or your university or your child's
[00:50:07] school decides for you what it is that you're going to use as an app or a website or a platform. That is an endless tension, and one where I've had a lot of personal experience of trying to chat with various companies like that, that I have to deal with, about better privacy choices. And when they're making the decisions for you, it is naturally going to be led by what is easiest and most convenient and cheapest, rather than what is best for the privacy of our staff, students, parents and families.
[00:50:47] Dr Genevieve Hayes: What final advice would you give to data scientists looking to create business value from data?
[00:50:54] Dr Katharine Kemp: Do you know, something that I've noticed is that I watch a lot of these videos of conferences that are just meant for marketing people, really insider chats. And there's this interesting thing that tends to happen at the end, where they save a couple of humanizing questions and say, as individuals
[00:51:14] ourselves, how do we feel about this? And I'm so intrigued to see that very often they will become a bit philosophical and say, you know, it's really quite creepy, isn't it? And I don't like it in my own life. And I think we've got to stop pretending, and so on. So I really do believe that, for a lot of people, philosophically, they understand that this is important, and it's important to how we function as a society, in allowing each other those spaces where we experiment and have our free thought and have our intimate relationships, that allow our society as a whole to be stronger.
[00:51:55] I think that creating services that are not just immediately trusted, but actually trustworthy, is critical. Being able to be transparent with your customers, and showing that your service earns their trust by being restrained in the ways that data is first collected, minimizing that, and in the ways it is used and disclosed, rather than trying to do a marketing spiel to sell it to them after the fact, is becoming more and more important.
[00:52:37] Our consumer surveys tell us repeatedly that consumers are anxious and angry and frustrated about how their data is treated. That is what we've seen in this latest survey in the report that I mentioned earlier called Singled Out. Consumers feel out of control. And earning back that trust, I think, will be vital.
[00:53:03] Dr Genevieve Hayes: For listeners who want to learn more about you or get in contact, what can they do?
[00:53:08] Dr Katharine Kemp: Well, firstly, you can take a look at the Allens Hub for Technology, Law and Innovation, where a lot of our research is stored, as well as taking a look at the SSRN page where my research is listed, and you'll find a lot of interesting reports on the Consumer Policy Research Center website as well.
[00:53:33] Dr Genevieve Hayes: Thank you for joining me today, Katharine.
[00:53:35] Dr Katharine Kemp: That's only a pleasure. Great to be with you.
[00:53:38] Dr Genevieve Hayes: And for those in the audience, thank you for listening. I'm Dr. Genevieve Hayes, and this has been Value Driven Data Science, brought to you by Genevieve Hayes Consulting.