IAPP Webinar: What to Do When Consent Doesn’t Work?
How De-Identification Requirements under CCPA and HIPAA Differ

Presentation Transcript
Dave:
Welcome to the IAPP Web Conference: What to Do When Consent Doesn't Work under CCPA: How De-Identification Requirements under CCPA and HIPAA Differ, brought to you today by Anonos.

My name is Dave Cohen. I'm the IAPP’s Knowledge Manager and I'll be your host for today's program. We'll be getting started with the presentation in just a minute, but before we do, a few program details.

Participating in today's program will automatically provide IAPP-certified privacy professionals who are the named registrants with one CPE credit. Those who are listening in can apply for those credits with an easy-to-use online form on our website.

I’d also like to remind you that today’s program is being recorded and will be provided free to registered attendees, approximately 48 hours following the live event.

We encourage you to ask questions at any time during the program by typing them into the Q&A field that's just to the right of your PowerPoint window and your questions will be answered by the presenters after the presentation during a designated Q&A period.

Now, onto our program and I'd like to introduce today's panelists.
[00:01:02]
Gary LaFever is the CEO over at Anonos. Gary, welcome to the panel and can you tell us a little bit about your background and the company?
Gary:
Thank you, Dave. I appreciate that. So I am a cofounder, CEO and, just as importantly, general counsel at Anonos, and our whole focus is technology that addresses the fundamental legal requirements to enable CCPA-compliant de-identification and GDPR pseudonymisation.

So it's technical solutions that solve legal challenges.
Dave:
Fantastic. Thanks, Gary. Joining Gary in the panel is Justin Antonipillai. He’s the CEO at WireWheel. Justin, welcome and can you tell us a little bit about your background and your company?
Justin:
Hey, thanks so much. And Gary, thanks for having me. Really excited to participate on the panel here with Gary, Deven and Khaled.

WireWheel is a privacy platform and we have offerings that include privacy program management, an automation software platform around assessments, driving privacy impact assessments, data protection impact assessments. We have modules around vendor assessments, data discovery and even infrastructure discovery in the public Cloud for privacy professionals. And our platform also offers a full subject rights automation platform, especially for those companies who are tackling the hardest problems under CCPA these days to build portals for their customers, collect and deliver that data back to their own consumers.

So I'm really excited to cover some of these issues with the panel today and to talk about where we're going, especially from a user perspective.
Dave:
Excellent. Thanks, Justin. And joining Justin and Gary on the panel, Deven McGraw is Chief Regulatory Officer at Ciitizen. Deven, welcome and can you tell us a little bit about your role?
Deven:
Sure. Thanks a lot, Dave, and I'm really happy to be on this panel as well. Ciitizen is a consumer-facing platform that helps patients, beginning with cancer patients, gather, use and share all of their health information. We are located in California. But I think probably my value-add on this panel is that, for two and a half years, I was the Deputy Director for Health Information Privacy in the HHS Office for Civil Rights, which is the office where all of the policy and enforcement of HIPAA occurs. And even prior to joining that office, I had written about HIPAA de-identification, its value in terms of making data available for important secondary uses while still protecting privacy, as well as concerns around whether the methodologies continued to hold up in the 21st century.
Dave:
Excellent. Thanks so much, Deven. And to round out our panel today, Khaled El Emam is a professor at the University of Ottawa Research Institute. Khaled, it's wonderful to have you with us. Can you tell us a little bit about your role?
Khaled:
Sure. Yes. Thank you. So I'm a professor at the University of Ottawa in the Faculty of Medicine. I run a research lab at the Children's Hospital as well that's focused on health and technology to enable the sharing of health data, and my main focus right now is on the generation of synthetic data and developing methods for data synthesis. I'm also a cofounder of a company that is developing software and tools to automate the data synthesis and synthetic data generation process.
Dave:
Fantastic. Thanks, Khaled. So as you can tell, we've got a breadth and depth of experience and knowledge here on the topic, so it's going to be a great conversation. And so, without any further delay, I'm going to turn it over to Gary to begin the program.

Gary, it's all yours.
Gary:
Dave, thank you very much. And I want to make something very clear to everyone. In our preparation for this webinar, we decided that we wanted a deck that people could refer to as a resource on its own. So there are going to be a number of very text-heavy slides that we may only quickly identify. They will be in the deck and available for your later use, but it is not our intent to go through each one of these slides in detail.

So again, the goal here is to have a very informative and hopefully interactive webinar and then a robust Q&A session. What you'll get as participants is not only the benefit of the discussion here today, but the deck itself. And so, I just want to make that clear.
[00:05:24]
So this webinar is different from a lot of webinars or other resources on the CCPA, which cover other issues that are very important. What this webinar really is about is, how do you balance protection and innovation? And sometimes, the answer is not necessarily what you might think.

Now, the GDPR has been around longer than the CCPA, but even there, many people talk about GDPR 2.0 because people are now focusing more on secondary processing, repurposing the data. And even with the CCPA, while the law is out, it will be years, until there's actual litigation and final resolution of disputes, before some of these matters are fundamentally and finally resolved. And we're all waiting for the Attorney General's implementation guidelines to have further clarity.

So what this webinar is about is based on what we know today and based on the state of the art in technology and the state of the art in legal knowledge and what we have learned from the GDPR to the extent that that's applicable, how do we maximize the lawful and ethical value of data while complying with the rights of consumers and data subjects?

And part of the challenge here that we want to address is, there's a lot of situations where consent by itself just is not enough. And a lot of the potential approaches that people take, okay? A lot of times, you hear people say, “Well, I'm going to anonymize my data so I can use it,” or “I'm going to anonymize my data so it's outside the scope of the GDPR.”

The reality is, statutes like the GDPR and the CCPA have specific statutory benefits, which we'll touch upon, if you pseudonymise data.

Yes, it continues to be covered personal information under the CCPA and personal data under the GDPR, but you actually, in many instances, can make greater use of the data by keeping it within that categorization and taking advantage of the technology. And that's something a lot of people don't get.

So the key significance of this slide is that by realizing both the requirements for these technical processes, de-identification and pseudonymisation, and the differences, particularly in how de-identification is defined under the CCPA versus HIPAA, you can actually get into the third column here. Where you don't want to be is the middle column, because those fish can't swim to each other and they only have so much oxygen. And that can be what happens today if you don't have the right technologies to enable lawful, distributed use, sharing, combining, et cetera. And you don't want to be in the far left because that's ancient history, if it ever existed.

The goal we believe and the purpose for this webinar is to talk about, how do you allow your fish to swim in any bowl by complying with the requirements to make that swimming lawful?

So I'd just like to take a pause on this slide in case any of the panelists wanted to speak particularly on their viewpoint at this highest level, which is maximizing privacy-respectful, ethical and lawful innovation.

Anything anyone like to add before we move on?
[00:08:35]
And I will take that delay as a no. All right. So what's happened here? Originally, data use was siloed, okay? It's not that long ago when data was all analog. Then it became digital, which made siloed purpose-specific processing much more efficient. Then it became localized, and what I mean by localized is sharing and use of data between related parties under controlled conditions and amazing innovations and discoveries came from that.

But as we get into the world of big data, you actually have even further distributed use of data. And on the one side, the benefit is, there's even more value that you can create, right? That's one of the concepts of big data, right? The inferences and the interconnections and the linkages that you can see to spot trends, et cetera that you wouldn't otherwise. But those same trends, those same capabilities come at increased risk.

And so now, most of these laws are requiring a risk-based re-identification analysis, okay? Which includes not only the risk from the data that you and your immediate partners have access to, but the data that third parties might be able to add to that data. And so, this is the concept of both data use and therefore, data protection.

And I referred to it earlier, some of the earlier attention that was paid to the GDPR had more to do with primary processing. And Justin will speak to this later. The real focus now under both the GDPR and the CCPA is secondary processing.
[00:10:15]
And so, you can almost think of this as kind of a 1990s mindset, which was more localized versus a distributed mindset, which is more distributed and repurposing. And again, the difference is protecting data from what can be done with it using data external to your capabilities.
[00:10:35]
This is a slide and there's a couple in here that we're going to run through pretty quickly, but this is a slide that's helpful for focusing on the different types of data processing that are involved, because the different usages impact the law and what's involved.

So just very quickly, and again, these are slides you can go back to. This is a use case of a fictitious video rental company, Acme Video, and it shows the four principal types of processing.

Number one is I'm ingesting data. I'm sharing data that comes from third parties to augment my capabilities. In this instance, Acme Video may do this to import the IMDB database of what movies are selling well, but you can imagine a lot of different sources that people ingest or import externally and they want to use it for part of their processing. So that's number one.

Number two is primary processing. I sometimes call this, keeping the lights on, right? And so this is important. This is why you collected the data to begin with.

The transition from two to three is sometimes confusing to people, okay? This is when you start to think with the data, you are repurposing the data.

Secondary processing, oftentimes referred to as analytics, AI, machine learning. The simple example here in the Acme use case is, I need to be able to serve the movies to my customer, but when I want to actually go back and historically view the movies they've viewed in the past, and then compare that to the viewing patterns of similar customers and make a recommendation, that's secondary process right there. It is not primary processing.

And so, the move from primary to secondary processing can get a little slippery and it actually has more of an implication under the GDPR than the CCPA. But again, it's highly relevant. So two is primary, three is secondary, four is data sharing where you're taking the output of your processing and you're sharing that with a third party. That could be because somebody has a particular skill that you want them to help bring into your house.

So for example, third party analytics, AI, machine learning. But it could also be a co-op that you belong to or you could be commercializing that data. This could be adtech, okay?

So again, whether you're taking data from external parties and inputting it, one, that’s sharing. Whether you're taking data and sharing it externally to your organization, that’s sharing, that's four. Primary processing is two, that's what most people think of, but different legal issues can be raised when you get to repurposing of data.
[00:13:15]
And so, one of the things here, and I'd ask Justin to kind of pick up on this, is that the distinction between these different types of processing was not as critical when consent, contract and anonymization worked, okay? But the issue is, consent oftentimes works, but it's not predictable and it oftentimes has potential negative ramifications.

It's not the purpose of this particular webinar, but consent has been viewed to be sometimes self-selective. So the dataset that you have as a result may not be representative of the overall market because certain types of people may be more apt to consent.

Also, if someone can revoke consent, then what dataset are you using day-to-day? Basically, under the CCPA, you have 45 days to delete data that was properly requested for deletion. Does that mean that every day you start with a new dataset that has been purged the day before? It's very disruptive.

Contract, again, particularly under the GDPR, has been defined very, very narrowly. And anonymization, again, is not what it once was, because it's much more difficult, and Khaled and Deven will get into this, to anonymize data when you also have to take into account data that's available from third parties.

So it is in fact these different issues that actually then combine and as I transition to the next slide, where now, consent, contract perhaps don't work, okay? And anonymization doesn't work all the time.
[00:14:52]
Justin, if you can jump in and kind of give us your perspective on where we are in this ecosystem.
Justin:
Thanks so much, Gary. And I think from … these are really terrific overviews of sort of where we have been and where we are going. As we've all talked about in even preparing for this webinar, Gary, I originally went into the Obama Administration in around 2013, before GDPR and around the time of the Snowden disclosures.

You might remember at that point, there was a lot of focus on the national security side and I ended up working extensively over the next four years on issues with the European Union to put in place the EU-US privacy shield. And I led a lot of the outreach as companies were preparing for GDPR.

I completely agree with the way you framed the conversation. And now at WireWheel, we've focused on building a software platform to help companies tackle first GDPR implementation and then CCPA.

So I think the question you originally posed to me was, talk about how over the last four or five, six years, there's been a progression around how companies have been tackling GDPR and CCPA to use these two slides and focus on what's the next step, which is really all around use.

And so, in just a few moments, here's what I would say. If you were to go back just a few slides, most of the focus on GDPR implementation after it really went into effect was, in a very European way, on documentation, although GDPR obviously has a lot of provisions that govern use and data subject access.

In the first year of GDPR, my sense of it is from the community, is that one of the number one focuses was to make sure that companies had documented their assessments and especially their DPIAs if they had been exposed to GDPR.

So there's absolutely no doubt all of us in the community and I'm even looking at the hundreds of participants on the webinar today. And anytime you're in an IAPP webinar, you have folks who are effectively PhDs in privacy.

And so, although GDPR has a lot of provisions on subject access and has a lot of provisions on use restrictions and other provisions, my sense of it was that the first year of GDPR was really about data discovery, infrastructure discovery and really documenting PIAs and DPIAs because of the focus of the European regulators on that part of the exercise.

And so, you saw a lot of companies focused on both consulting services and technologies that allowed you to find where you were storing the data, find where you were processing the data, understand what personal data there was and get a basic understanding of with whom you were sharing that information inside your organization and out.

And that was very basic in many, many ways. And obviously, you saw a tremendous investment by a lot of companies in the first year after GDPR on doing that initial infrastructure discovery, that initial data discovery and documenting assessments at the highest levels.

As we were implementing with those companies, obviously, then you have the shift to move to the next slide around CCPA.
[00:18:57]
And technology is important at every step of the way. It's not only technology; you need deep expertise from the privacy perspective to make these programs work. But what we really saw in the last year, helping very large and very small companies get ready for CCPA, is that there was a very big move on the implementation side of privacy, from focusing on privacy impact assessments and data protection impact assessments to the first requirements under version one of CCPA around data subject access, to use the European term, or subject rights automation.

We all know this, especially this community who's on the phone today, but the major focus in CCPA has not really been around assessments. It's been around individual rights and specifically, the individual rights of being able to access your data, delete your data, and opt out of the sale of your data with all of the vagueness that comes with the term, sale of data.

And so, in this last year and a half, we've seen a very significant investment on technology and on consulting to be able to set up that infrastructure to be able to take a subject access request or a subject rights request of access deletion or opt out, to be able to validate that the requestor is who they say they are and to be able to go and get that data from backend systems and safely and securely deliver that.

And to be honest, I still think we're early in the implementation of that particular set of challenges. If you look at a number of the articles out there, I'd say a lot of companies started with a very lightweight way to take a request, and I'd say a tremendous amount of the community over this next year is going to be maturing their ability to help their own customers manage consent, preference management and their own subject rights under California.
[00:21:14]
But that takes us to, where is this going? If you look at the last four to five, six years, it started with finding your systems, documenting them, understanding the basic risks in those systems under GDPR. It moved to making sure that you have some basic infrastructure to safely manage rights for your customers, and that technology element is so important there because otherwise, everything becomes a human element. That's where we've been focused.

There's almost no doubt that the next phase of all of this is going to be about use, and that's where this Slide 10 becomes really important, because of both GDPR and all of these laws that are coming out around the country, whether you look at Washington's proposed law, New York, and the one that I would give the most certainty of passing, CCPA 2.0. The second version of CCPA is polling at this point at 85% to 90%, and it's a ballot initiative, so we don't have to rely on a legislature to pass it.

And if you look at the major requirements around CCPA 2.0, almost all of that focus is now moving from individual rights to use: being able to understand internal use, primary processing, secondary processing and data sharing, and being able to show in an auditable way over time that you are actually using identifiable data in a way that's appropriate, or taking really specific steps around pseudonymisation and anonymization before you use that data.

So I'll stop there. Happy to come back to it, but I think you've seen a real progression from 2013 on: documenting your systems and doing basic assessments, then enabling individual rights. But where this is all going over the next few years is really going to be around making sure internal, primary, secondary and data sharing use is either appropriate using identifiable data, or understanding some of the technologies you can bring to bear to be able to have anonymized or pseudonymised versions of the data.
Gary:
Justin, thank you very much for that. And I do want to let all participants on the webinar know, we're going to follow this webinar up with a summary of everything that's covered, and we'll send it around to everyone who's registered as a summary of points.

Justin, you just gave a great textbook example of the movement from a primary focus on data subject or consumer rights to a focus on data use. Thank you for that very eloquent and yet cogent description, and what it actually highlights is the first half of this webinar: when consent doesn't work, and why is it that consent doesn't work? It's typically at the interactions, the intersections, of these different uses that are here. Consent is typically, and I am making generalizations, focused on primary processing. It's difficult to get and can be difficult to predict when you have additional uses.
[00:24:43]
So let's move into that. This is just a quote from a couple of different things. I'm not going to go into each of the different quotes, but there's a lot of people who acknowledge that consent by itself has limitations, something you can refer to in the slides when you get them.
[00:24:59]
This is a slide that just highlights some of the issues with consent. I don't want to go into too much detail here, but many organizations realize that first off, they have to have predictability of operations, right? Compliance with laws is not enough if they can't continue to run their business.

So the consent-only based usage of the data can be very disruptive. That's all I'll say there. Legal requirements, okay? In some jurisdictions, you have to have consent. In other jurisdictions, the requirements for consent are so narrow, you could never get a compliant consent for something that is going to be an iterative analytic process because you can't describe it sufficiently in advance.

Consent also, there's all kinds of ethical issues, right? What's legal is not at the same level as what's ethical, okay? Self-selection impact, I touched upon this before. There's actually a negative, depending on what your data use is, that if you only take consent, have you self-selected something that's not representative of the data that you want?

Also, and this is an interesting one: if you're relying on consent, you have potential liability from the data sharing partners that you take data from, if they did it wrong, okay? Under the CCPA, if you're in possession of data for which the consent was inaccurately or inadequately secured, and/or the data subject, the consumer, exercised the right of deletion and that wasn't carried over to you, you could have liability from data that you took.

Conversely, and another thing is, there are exceptions under the CCPA. You're not subject to it if you're under a certain size, right? If you're not transacting business with a minimum number of California consumers or doing a certain threshold amount. But if you take data from a number of sources, each of which is exempted and therefore didn't have to comply with the CCPA, and in aggregating that data you cross the thresholds, then again, potential liability.

And the converse is true. When you start to share that data with third parties, you could be subjecting them to liability. So the point I'm trying to get at here is having a consent plus model where de-identification, pseudonymisation actually helps you make greater use of that data in a predictable way is something to seriously consider for operational purposes.
[00:27:21]
So I'm not going to go into each of these. Again, these are here as a resource for you, but it's important to realize that sale under the CCPA, I would argue, and I'm sure there's other people on the other side, really means sharing with consideration and the consideration perhaps could be very loosely defined. We will not know for certain until lawsuits are finally resolved and/or we have further guidance from the Attorney General. But looking at sale under the CCPA in a very narrow sense, is a very high-risk approach.
[00:27:54]
Also, personal information as defined under the CCPA. There's a concept that is included under personal information of a probabilistic identifier. And this is really interesting.

So basically, if you replace someone's name with a static identifier, a lot of people think that's anonymous, right? I'm not saying Gary LaFever, I'm saying ABCD. But if every time that Gary LaFever would appear as a direct identifier, I replace it with ABCD, that token actually is itself a direct identifier.

So what is meant by anonymization has to be seriously considered because that term will not take you outside of personal information under the CCPA if it includes probabilistic identifiers.
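The distinction Gary is drawing between static replacement tokens and true pseudonymisation can be sketched in a few lines of code. This is a toy illustration only, with hypothetical names and token functions, not a compliance-grade implementation: a static token recurs across records and so still singles out one individual, while a dynamic pseudonym differs per occurrence and can only be re-linked through a separately held lookup table.

```python
import secrets

# Hypothetical records about the same individual.
records = [
    {"name": "Gary LaFever", "movie": "Movie A"},
    {"name": "Gary LaFever", "movie": "Movie B"},
]

# Static tokenization: the same person always maps to the same token,
# so the token still singles out one individual across records and
# functions as a direct (or probabilistic) identifier.
static_map = {}

def static_token(name):
    if name not in static_map:
        static_map[name] = secrets.token_hex(4)
    return static_map[name]

# Dynamic pseudonymisation: a fresh token per occurrence; re-linking
# requires the separately held lookup table (the "additional
# information" kept under controlled conditions).
lookup = {}  # token -> name, held separately from the data

def dynamic_token(name):
    t = secrets.token_hex(4)
    lookup[t] = name
    return t

static_view = [static_token(r["name"]) for r in records]
dynamic_view = [dynamic_token(r["name"]) for r in records]

assert static_view[0] == static_view[1]    # linkable: same token recurs
assert dynamic_view[0] != dynamic_view[1]  # not linkable without the table
```

The design point is that the dynamic view preserves authorized re-linking (via `lookup`) while removing the persistent identifier that makes the static view trivially linkable.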
[00:28:42]
And here's just a quick example. This is the famous Latanya Sweeney example. If I go to the Department of Census and I ask for a copy of data on US citizens as of the last census, I can ask for one dataset with birth date, one with zip code, and one with gender, and each of those will be “anonymous” because no one's name is represented. But if my name is represented as ABCD across those three, my identity, and depending on who you talk to, up to 87% of the US population, can be identified by name.

So anonymous does not mean de-identified, ironically. And the issue comes about: that's if I get all three of those datasets from one party, Party A here being the Department of Census. But I actually can have different datasets that are distributed to different parties, so no one party holds the three different datasets. Because of the way that the CCPA and GDPR approach what these terms mean, anonymization would require that those three parties cannot get together and combine those datasets if the potential of the combination is reasonable.
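The mosaic-effect re-identification described above can be made concrete with a small sketch. All data here is invented for illustration: three separately "anonymous" releases share a pseudonym, and an attacker with auxiliary data (say, a public voter roll) joins on the quasi-identifiers birth date, zip code and gender to recover a name.

```python
# Three separately released "anonymous" datasets, each keyed by the
# same static pseudonym (hypothetical data).
birthdays = {"ABCD": "1970-01-15"}
zips      = {"ABCD": "90210"}
genders   = {"ABCD": "F"}

# Auxiliary data the attacker already holds, carrying real names
# alongside the same quasi-identifiers.
voter_roll = [
    {"name": "Jane Doe", "dob": "1970-01-15", "zip": "90210", "gender": "F"},
    {"name": "John Roe", "dob": "1980-06-02", "zip": "10001", "gender": "M"},
]

def reidentify(token):
    """Join the three releases on their shared pseudonym, then match
    the combined quasi-identifiers against the auxiliary data."""
    quasi = (birthdays[token], zips[token], genders[token])
    matches = [v["name"] for v in voter_roll
               if (v["dob"], v["zip"], v["gender"]) == quasi]
    # A unique match re-identifies the record by name.
    return matches[0] if len(matches) == 1 else None

print(reidentify("ABCD"))  # → Jane Doe
```

This is why a risk-based analysis has to consider data held by third parties: none of the three releases is identifying on its own, yet their combination is.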
[00:29:53]
And this addresses an issue that's actually in these different laws, something I tried to get at earlier. I'm going to use a European term, functional separation. And the reason I'm going to use it is that different laws call it different things. The CCPA actually has pseudonymisation in one respect for research, but it also uses de-identification.

Other laws use different terms. To really make things confusing, the Brazil law uses the term anonymization, but they don't mean true anonymization. But what all these have in common is that this is not encrypting data at rest. This is not encrypting data in transmission. It's what Justin said.

How are you protecting the data in use? And if you can protect the data in use against unauthorized re-identification via what often is referred to as the mosaic effect, you have greater rights.

And so, it's for all these reasons that consent may not work. So I just want to pause here and see if anyone has any comments on the panel about the need for something to buttress consent and why relying just on consent is not as simple as it sounds.

So with that, actually, since we haven't yet heard from Deven or Khaled, Deven, do you have any comments on this, the need for something over and above consent?
Deven:
Yeah. I mean, in all privacy law anywhere, consent is not the sine qua non, right? There have always been exceptions to where consent is needed. Research, typically medical research or research in the public interest, is typically one of those exceptions. So it's pretty well-recognized. I think the relevant question when you look at HIPAA juxtaposed against CCPA, and think in particular about research that implicates health and wellness, is whether the traditional tools that we have built and relied upon under HIPAA are going to be as effective with respect to CCPA given the differences, and we can hold off on diving into the detail of that until we get to those slides, which are coming.
Gary:
Yeah. And we're about to hit that. So thank you, Deven. Without yet getting into the differences, Khaled, did you have any comments on the need for something more than just consent and why that's important? And what I'd love for you to speak on is both why it's important to the individual as well as why it's important to the organization that wants to process the data, and to society generally.

So if you could give us your perspective on why consent alone may not get us where we want and need to be.
Khaled:
Sure. So I mean, one of the things I wanted to highlight just to elaborate on points you raised is that there's actually quite a lot of evidence of consent bias. And in particular, in the health sector. There have been many studies and many systematic reviews demonstrating that consenters and non-consenters differ on important characteristics. So you essentially end up with a biased dataset when you consent or try to re-consent individuals.

And in some cases, there are no alternatives. For example, if you're doing a clinical trial, then you're going to inject patients with the new compounds. You have to obtain their consent. But for secondary uses of data that already exists, if you have options, then this should be considered because otherwise, there’s a pretty high likelihood that you'll end up with a biased dataset.

So that's really important to understand. The other thing I think that hasn't been done much is looking at the benefits side. I mean, we keep talking about the risks, what are the risks to individuals, but we should also consider as part of this balancing act is, what are the benefits to individuals from having their data used or benefits to society?

And again, in the context of using data for health analytics and health research, whether it's academic or commercial, there are many potential benefits. And arguably, they should also be taken into account when deciding what's acceptable in terms of protecting the rights of the individuals.

And I don't think we've been doing … in terms of the narrative that's been going on, the benefits side of the equation is often not emphasized enough.
[00:34:28]
Gary:
I could not agree more. And so, here's the slide on Pseudonymisation. And it's interesting how many companies when they hear the term pseudonymisation in the context of the GDPR, they say, “Well, I don't want that because it's still subject to the GDPR.” But the reality is, and going to what Khaled just mentioned, when you're talking about the benefits to the individual as well as to society, if you can satisfy the requirements of pseudonymisation and the definition is literally identical under the GDPR and the CCPA, you actually get both the benefits of obscuring the linkage over that top wall back and forth between information value and identity. But you haven't given up the capability under controlled authorized conditions to actually do the relinking.

And that can be a significant benefit to the individual when it's authorized, when it's permissible. So that's exactly what we're talking about here, are technologies that can actually increase the benefits to the individual as well as to the organization that's doing the processing as well as to society.
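The controlled relinking Gary describes can be sketched in a few lines of Python. This is a minimal illustration, not a compliant implementation: the key, the identifier format and the lookup table are all hypothetical, and a real deployment would require far stronger key management and separation of duties.

```python
import hmac
import hashlib

# Hypothetical secret held separately by the data controller.
SECRET_KEY = b"held-separately-by-the-controller"

def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier with a keyed token.

    Without the key, the token cannot be linked back to the identity.
    With the key, the same identifier always maps to the same token,
    so records can still be matched across datasets.
    """
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()[:16]

# The "additional information" kept separately: a protected mapping
# from token back to identity, used only under authorized conditions.
relink_table = {}

def protect_record(record: dict) -> dict:
    token = pseudonymise(record["patient_id"])
    relink_table[token] = record["patient_id"]  # stored under separate controls
    protected = dict(record)
    protected["patient_id"] = token
    return protected

row = protect_record({"patient_id": "MRN-00123", "diagnosis": "J45"})
# The analytics copy carries only the token, yet an authorized party
# holding the relink table can restore the linkage when permitted.
```

The design point is exactly the "two benefits" in the discussion: the analytic dataset no longer reveals identity, but the controller has not given up the ability to relink under authorized conditions.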
[00:35:30]
So I just want to very quickly touch upon the next two slides. A lot of people are surprised by this. The term pseudonymisation is cited 13 times in the GDPR, and it has express statutory benefits of use.

Just to give you one: the right to portability does not apply if you're processing data under the GDPR using legitimate interest as your legal basis, supported with pseudonymisation. That's just one. There are many express benefits to pseudonymisation.
[00:36:01]
And under the CCPA, the same is true for de-identification.

And so, the first part of this webinar, which we're now bringing to a close, was about when consent doesn't work. And the reason consent doesn't work may be any one or a combination of the reasons we've noted. So rather than spending more time on that, and again, the slides have more information, please ask questions during the webinar, and afterwards feel free to ask questions as well. We'll give you an email address at the end.

But the point I want to make here before we transition over as to the differences between DID or de-identification under HIPAA and the CCPA is there are express statutory benefits of using these techniques if used correctly.
[00:36:47]
Okay. So we're moving into the second half of the webinar now. And these are some quotes and you always have to be careful about quotes because they're accurate at the time of the quote. And with laws as dynamic as the CCPA, you also have to be cognizant of changes that have been made.

So one of the panelists pointed out that the quote on the far left side is actually now outdated because of changes that have been made to the CCPA. But I will summarize, and hand over to Deven first and then Khaled, the quote at the bottom. And I really think that's the quote we're wrestling with here, right?

HIPAA’s de-identification rules were determined decades ago when primary uses were localized. Now, many of those uses are distributed and global. And so, the question is whether or not the tools that worked under HIPAA are the tools that a forward-thinking law like the CCPA should include.

So the next six or seven slides have a lot of detailed regulatory language. I'm going to go to those slides only if Deven or Khaled would like me to, so I'm not dragging everyone through seven text-heavy slides. But Deven, if you would like to start, and if you need me to go to any of those slides, I will, and then we'll hand it over to Khaled.
Deven:
Okay, great. Yeah. The text-heavy slides. Those are definitely for reference.
[00:38:11]
But you'll be glad you have them in your deck as you really dive into this. I want to say a few things about the juxtaposition of the CCPA and HIPAA. If you're a HIPAA-covered entity or a HIPAA business associate potentially subject to the CCPA, you ideally want to be able to rely on the exemptions.

I think probably everyone on the call by now is aware that there are exemptions in the CCPA involving HIPAA-covered entities and business associates. But it's important to remember that these are not entity-wide exemptions. They are exemptions for PHI, for information that qualifies as PHI.

If you have been routinely using de-identified data and enjoying its use for a broad spectrum of purposes, research, commercialization, et cetera, because it's de-identified and no longer regulated by HIPAA, you now have to recognize that, because it's not PHI anymore, it doesn't enjoy that HIPAA exemption under the CCPA.

So then you've got to determine whether your HIPAA de-identified data are going to meet the CCPA de-identification standard. And I think this is really tricky, actually.

HIPAA’s requirement for de-identification, and Gary is on precisely the right slide here, which is Slide 21, but the legal standard is that there's no reasonable likelihood that you're going to be able to identify an individual.

So it's really tagged to: can you identify this individual? In contrast, the definition of de-identification under the CCPA, which we've got on Slide 25, is not about identifying an individual. It's about whether you can associate data with an individual, or whether you can link data to an individual. And that matters because the common way we utilize de-identification in the healthcare space, particularly around epidemiological and other research, is to still be able to link people longitudinally across databases, but without knowing exactly who they are.

And I think this calls into question whether, if you've de-identified data in accordance with the HIPAA standards, you've necessarily met the definition under the CCPA. And if you don't meet that CCPA definition and you are otherwise covered by the CCPA, you potentially have an issue where you're going to have to try to meet the CCPA definition of de-identification, which I think in the healthcare space is not customarily the approach we have taken.

Again, I'm not obviously the final word on the intersection of this language, but it certainly gives me pause in terms of whether or not HIPAA’s de-identification methodologies will get you there.

And if they don't, then you may even be better off keeping your data as PHI and thinking about other ways to use disclosure control techniques that might assist you with getting a waiver of consent from an IRB, for example, or using a limited dataset; that data would still be considered PHI and therefore exempt from the CCPA.

I'll pause there.
Gary:
No, that was great, Deven. And I liked the way you ended up there. It seems to me, and then I'll hand it over to Khaled, there are really three options for a company in that situation, or a covered entity.

The first is to, as you said, take different steps to maintain it as PHI, right? So that it benefits from the exemption. The second is to evaluate their approach, and I think this is something Khaled is going to talk about: if in fact you're using Expert Determination, your HIPAA de-identification could possibly satisfy the CCPA.

So the first is, keep it PHI and keep it under the exemption. The second is, evaluate whether or not your de-identification technique and approach satisfies the CCPA. But the third one is, there are technologies (we're a technology provider that offers them) that can satisfy the heightened requirements of de-identification under the CCPA, and that's the third.

So you either keep it PHI and stay within the exemption; second, evaluate whether your approach to de-identification under HIPAA actually meets the standards of the CCPA; or third, upgrade your de-identification to satisfy the CCPA, which by definition would satisfy HIPAA.

Khaled, if you could pick up from there, I'd very much appreciate your perspective on this.
Khaled:
Sure. Absolutely. So just to comment on Deven’s point, I think it would be quite unfortunate if the solution is number one, where there are disincentives for applying disclosure control techniques and privacy-enhancing technologies and we revert to using PHI.

I don't think the intention of the drafters of the CCPA was to encourage the proliferation of the use and disclosure of personally identifying information. So hopefully, that's not where we end up. But what I generally like to do is apply the requirements to health research and to using health data for secondary purposes, again, for academic and commercial uses, and see whether the rules or the requirements would enable that.

And there are really two key things that any methodology would need to satisfy. You need to have longitudinal data. So you need to be able to link non-identifiable records that belong to individuals over time, because without longitudinal data, you really are very limited in what you can do from an analytics and AI and machine learning perspective.

And then the other thing is, you have to be able to build models and draw inferences and conclusions about groups of people. And by implication, if you're drawing conclusions about groups, you're also drawing conclusions or inferences about the individuals in those groups, because that's the basis of statistics. You're building models, drawing inferences about groups of people who have certain characteristics, and making predictions or characterizing them in some way.
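Those two requirements can be sketched together with consistently pseudonymised records: the same person always maps to the same token, so records can be linked over time without identities, and group-level statistics can still be computed. The key, field names and values below are invented purely for illustration.

```python
import hmac
import hashlib
from collections import defaultdict

KEY = b"example-key"  # hypothetical key held by the data custodian

def token(pid: str) -> str:
    """Consistent keyed token: same patient, same token, no identity."""
    return hmac.new(KEY, pid.encode(), hashlib.sha256).hexdigest()[:12]

# Visits recorded at different times; direct identifiers removed, tokens kept.
visits = [
    {"pid": token("MRN-1"), "year": 2018, "a1c": 7.9},
    {"pid": token("MRN-1"), "year": 2019, "a1c": 7.1},
    {"pid": token("MRN-2"), "year": 2019, "a1c": 6.4},
]

# Requirement 1, longitudinal data: group non-identifiable records per person.
history = defaultdict(list)
for v in visits:
    history[v["pid"]].append(v)

# Requirement 2, group-level inference: average A1c change among
# patients with more than one visit, computed without any identities.
changes = []
for h in history.values():
    if len(h) > 1:
        ordered = sorted(h, key=lambda v: v["year"])
        changes.append(ordered[-1]["a1c"] - ordered[0]["a1c"])
avg_change = sum(changes) / len(changes)  # roughly -0.8 for this toy data
```

The point is that neither longitudinal linkage nor group inference requires knowing who anyone is; it only requires that records belonging to the same individual remain linkable.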

So these are two fundamental requirements. And if any disclosure limitation method and any regulation limits that, then that's a severe limitation on the ability to use health data.

So I think that, in general, there are three privacy-enhancing technologies that can be applied: there's pseudonymisation, there's de-identification and there's synthetic data. From the HIPAA perspective, for de-identification, we have two standards, which are on the screen now.

Safe Harbor, I think, is in general not a very strong standard for de-identification. It was a good standard at the time it came out, but over time its weaknesses have become obvious, and I generally would not recommend using Safe Harbor.

So let's talk about Expert Determination. Expert Determination is very flexible, and it essentially requires you to use the best available practices at the time. So from that perspective, it's quite flexible and it doesn't necessarily lock you into a particular methodology. And all three methodologies, or three paths, that I mentioned, pseudonymisation, de-identification and synthetic data, can fall under the umbrella of Expert Determination.

And I acknowledge that expectations are getting stricter and arguably, what has been deemed acceptable some years ago may no longer be deemed acceptable. So there's a need to continue to innovate and come up with better methods that are more protective.

And I think some of the modern or recent methodologies for de-identification would meet the purposes of the CCPA: disclosure control methodologies that at the same time enable longitudinal data use and inferences from the data. Data synthesis works as well, because with synthetic data there's no one-to-one mapping between a synthetic record and a real person, and therefore that whole linkability problem is solved from the outset.
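To illustrate why synthetic data breaks record-level linkability, here is a deliberately naive sketch: it samples each field independently from the observed values, so no generated row corresponds one-to-one to a real person. Real synthesizers fit a joint statistical model to preserve correlations between fields; the data and names here are made up for the example.

```python
import random

random.seed(0)  # reproducible toy example

# Tiny "real" dataset (entirely fictional values).
real = [
    {"age": 34, "dx": "J45"},
    {"age": 51, "dx": "E11"},
    {"age": 47, "dx": "I10"},
]

def synthesize(rows, n):
    """Toy marginal sampler.

    Each field is drawn independently from the observed values, so a
    synthetic record is a recombination, not a copy, of real records.
    Production synthesizers model the joint distribution instead.
    """
    cols = {k: [r[k] for r in rows] for k in rows[0]}
    return [{k: random.choice(v) for k, v in cols.items()} for _ in range(n)]

synth = synthesize(real, 5)
```

Even in this crude form, a synthetic row like `{"age": 34, "dx": "I10"}` matches no single real person, which is the linkability property being described.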

So there are a number of different approaches that can be used here. And again, as long as those two requirements, longitudinal data and inferences, are met, then we have data that's actually useful for secondary analysis.

So I'll pause here and then see if anyone else has any comments or any questions on these points.
[00:47:35]
Gary:
Yeah, because I would like to get to questions. Does any panelist have anything that they want to cover quickly before we open it up to questions?
[00:47:46]
Okay. I want to point out a couple of things. First off, if you asked a question and it's not answered during the webinar, we will do our best to answer it within the next couple of weeks. So please don't feel that if you asked a question and it wasn't addressed, it's not going to be covered.

Also, if you have follow-up questions that occur to you after the event, you can send those to questions@anonos.com. That's questions@anonos.com. We also want to make everybody aware that there's an IAPP webinar next week on legitimate interest processing under the GDPR. It has different panelists, including somebody from Privacy International and someone who used to be with the Italian Garante, so it has much more of a European orientation, but covers similar topics.

Also, Khaled gave a fantastic webinar just yesterday on de-identification. That's available at replica-analytics.com/knowledgebase. And if you have any questions on pseudonymisation, you can go to www.pseudonymisation.com or anonos.com.

So with that, let's open it up to questions now for those on the webinar and see what we get.

Oh, by the way, just to give you an indication, we had over 500 people registered for this webinar, so a very high level of interest, and the legitimate interest webinar has over a thousand already. So this is a very, very topical point, and we could not be more appreciative of the panelists for giving us their time. So let's take advantage of that now and take questions from the audience.
Dave:
Yes. Thanks, Gary. This is Dave from IAPP coming back on the line. Appreciate that. And so we do have some questions from the audience. Before we get started with those, let me remind all of you that the way to handle questions is to submit them via the field that's just to the right of the PowerPoint window. There is an open field there. You can type your questions right in and submit them to us. They will be anonymous. So please go ahead and forward those over to us.

So now, let's get started. We do have a few in the queue here. And to start, here's one: "Regarding 'sale,' would a scenario where the data is transferred to a third-party service provider to analyze/process the data and deliver the data back be clearly not a sale, or in a gray area?"

A question, I'm sure, many are wondering about. Gary, let me throw that to you first and then we'll see what others have to say about that as well.
Gary:
Yes. So we are waiting for guidance from the Attorney General. There are certain exemptions for service providers that are providing work on your behalf. But I think we need further clarification.

It's a great question because it highlights, sale does not just mean exchange of data for cash, okay? And there's a number of good quotes and stuff within the deck itself. So my answer would be, it's unclear and we hope to have further guidance from the Attorney General.
Gary:
Good point.
Deven:
I would document your rationale for why you don't think it's a sale. And you actually might be able to use some of the HIPAA language around what constitutes a sale of data, which is really very clear that payment for services, even when data is exchanged as part of those services, does not constitute a sale. That was part of the omnibus regulation out of the HITECH legislation, in 2013, I think.

So this is a hint. Look, it's bootstrapping, right? But you're pointing to another plausible legal source for why this shouldn't be considered to be a sale, in addition to citing the CCPA itself.
Gary:
That also raises another point and that's the difference between what's lawfully required versus what's ethically expected from the customer base or the consumer base. And so, we won't know that for a while, right? So there may be a technical out under the law. You also have to ask yourself, what is my customer base going to think when they find that I'm still doing it?

So that's why it's going to take a couple of years to completely clear up, right? I think part of why the IAPP audience is so powerful is that data governance is also business governance. What makes the most sense has to take account of what's legally required, but then we also have to put on our business hats and think: how is that going to be perceived in the marketplace?

So great question.
Dave:
Yeah. And Deven, I really think …
Justin:
The one other thing I might add, this is Justin, is that it's pretty clear that there are certain service provider functions involving secondary processing, as you covered earlier, that are very likely to be treated as a sale. So one example of that is, if you have an opt-out of the sale of data [inaudible] CCPA and you transfer data to somebody to surface an ad, and the ad provider is using that personal data for things like contextualization or for any other purpose, it's pretty clear that the AG is going to treat that kind of use, when there's an opt-out of the sale of data, as a sale.

So I agree with you there. There is some revision coming out here, I think, in the final leg. But I'd say some of the secondary processing that we all take for granted in MarTech and contextual ad surfacing right now is pretty clearly going to be viewed as a sale under the California law.
Gary:
Great discussion. Anyone else on that?
Dave:
That was terrific, Justin. And I think, Deven, some pretty sage advice from you there in terms of a practical business approach: document your rationale, choose a thoughtful approach and do your due diligence at the very least. So that's really good advice.

Let's go ahead and move on. We just have a couple of minutes left here. We have time for a couple more questions. Gary, this one is for you. It goes back to something that you covered very early on in the web conference and that is, “Why are there different risks between local and distributed processing?”
Gary:
Yes. And so, here's the point, and this very much goes to how data use has evolved over the years. It also goes to, in the Expert Determination under HIPAA, part of the issue of assessing what controls are necessary: who are the intended recipients and users of the data, right?

And if I can control the people who have access to that data, I can, on a risk basis, say I don't need as stringent a set of protections. Once that data starts to be used in a more distributed fashion, you have to be able to protect against re-linking, re-identification and unacceptable longitudinal analysis, because it wasn't intended and it wasn't authorized.

And so, as you have more distributed data use, you need higher levels of protection that protect the data in those distributed use cases. In essence, the newer laws, the CCPA and the GDPR, are there to allow this broader data usage, but it requires a higher level of protection. You can always go back to just enclave processing, but then you have to be careful. Let's not forget about data breaches.

And when your data is breached, the person who did it did not agree to not misuse the data. That's another reason why de-identification or pseudonymisation as a security solution is advantageous: if the data you're processing has been de-identified, there's less risk to the consumer or the data subject if it's breached.
Dave:
Terrific. And I think we have time for just about one last question. Gary, I’m actually going to stay with you for this one. Someone is asking about repeating the three options. There was taking steps to stay within the exception, evaluate whether the de-identification satisfies the CCPA and then there was a third one. Would you mind just covering those again?
Gary:
Sure. And I really liked that interaction and it's a great question. So what came out were three possible approaches and this is healthcare-specific, okay? The exception for HIPAA is PHI. So if you're processing protected health information, there's an exception under the CCPA. I should point out that same thing is true for GLBA and the Fair Credit Reporting Act. But in all three instances, it's not a blanket exception, okay?

So the exception for Fair Credit Reporting Act does not mean banks aren't covered. It's the same thing. The kinds of data that are required under the FCRA are exempted, but the rest of the data that's usually collected in connection with a credit report are not. So it's very similar. It is the key elements of data.

So the three options that we talked about were, one, keep it within the exception by keeping it as PHI, not protecting the data. Now, what that means, tying it back to the prior question, is you can only have localized processing, okay? You can't process PHI on a broad distributed basis, because what I mean by the term broad distributed basis is one where you can't control the recipients, okay?

So the first one is keep it within the exception by not protecting the data with technology and having other types of protections in place. So it remains PHI and therefore, excepted.

The second one, okay, was to evaluate whether or not your current HIPAA de-identification approaches satisfy the CCPA. And this is going to go to whether they're risk-based and how advanced they are. It's very unlikely that Safe Harbor would work, but an Expert Determination approach that is actually risk-based, combined with some security controls, may be adequate.

So option number two, assess your current de-identification under HIPAA and see if it's compliant. And option number three is, as Khaled mentioned, advanced technologies, right?

Anonos, my company, and Khaled's company actually work in that space. There are ongoing advances in de-identification technologies that could enable you to satisfy the heightened CCPA requirements, and therefore you would definitely satisfy the HIPAA requirements.

So those are the three. One, keep within the exemption because you don't protect with the technologies; you have other approaches in place. Two, evaluate whether your current HIPAA de-identification satisfies the CCPA. Or three, upgrade to CCPA-level de-identification and thereby satisfy both the state statute and the federal legislation.
Dave:
Terrific. Thanks so much, Gary. And unfortunately, we're just about out of time here, but as Gary mentioned earlier, if you did submit some questions and we weren't able to tackle them on this program, we're going to try to answer them post-program. So look for a follow-up on that.

Also, as you can see in front of you, everyone's email addresses are there and we've also got questions@anonos.com address where you can submit further questions. So please do get those into us so we can see if we can tackle some of those.
[00:59:05]
So before we drop off, if you're still on the line with us, we would love to hear some feedback from you. There's a live link in front of you. If you're listening to this webinar live and you can head on over there, pull up a browser tab and answer a few quick questions for us that we've timed. It takes literally two minutes.

Let us know how you enjoyed this program, what we can do to improve and importantly, there's a field in there where we can accept your input on topics that we might cover on further privacy education programs.
[00:59:34]
A huge thank you to Anonos for providing the support for this program, making it free to all of you, for hosting it today and for getting this great panel together on this great topic. So we here at IAPP want to give a great thank you to Anonos. And thanks to Justin, Khaled and Deven for joining us on the panel with your expertise. We really appreciate it. Thank you.
[01:00:00]
And then, as I mentioned at the beginning of the program, if you're an IAPP-certified privacy professional and you registered for this through the IAPP website, we're going to automatically grant you one CPE credit. You don't have to do anything. That's just going to be taken care of for you.

And if for some reason you're listening and would like to receive that CPE credit but didn't actually register through the website, which is fine, there's a certification tab on our website where you can fill out a very simple form and receive that credit.

Also, if you're an attorney and you're wondering about continuing legal education credits or CLEs, we don't pre-certify these. So you'll need to apply to your particular jurisdiction for that.
[01:00:40]
But they're often eligible. So if you want to reach out to me, if you need supporting materials, please feel free to do so. My email is easy. It's dave@iapp.org. Feel free to reach out to me via email or give me a call and I'd be happy to see what I can do to help.

So with that, thanks one last time, everybody for joining us today. Hope you enjoyed the program and hope to see you on another program soon.

Take care. And with that, I'll take us to a program close.
 
Are you facing any of these 4 problems with data?

You need a solution that removes the impediments to achieving speed to insight, lawfully & ethically

Roadblocks to Insight
Are you unable to get desired business outcomes from your data within critical time frames? 53% of CDOs cannot achieve their desired uses of data. Are you one of them?
Lack of Access
Do you have trouble getting access to the third-party data that you need to maximise the value of your data assets? Are third parties and partners you work with worried about liability, or disruption of their operations?
Inability to Process
Are you unable to process data due to limitations imposed by internal or external parties? Do they have concerns about your ability to control data use, sharing or combining?
Unlawful Activity
Are you unable to defend the lawfulness of your current data processing activities, or data processing you have done in the past?
THE PROBLEM
Traditional privacy technologies focus on protecting data by putting it in “cages” or “containers,” or by limiting use to centralised processing only. This limitation is imposed without considering the context of the desired data use, including decentralised data sharing and combining. These approaches are based on decades-old, limited-use perspectives on data protection that severely minimise the kinds of data uses that remain available after controls have been applied. On the other hand, many new data-use technologies focus on delivering desired business outcomes without considering the roadblocks that may exist, such as those noted in the four problems above.
THE SOLUTION
Anonos technology allows data to be accessed and processed in line with desired business outcomes (including sharing and combining data) with full awareness of, and the ability to remove, potential roadblocks.