If you hate HIPAA, it’s your lucky day. Paul Ohm is handing you ammunition in his article, “Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization.” His argument: our current information privacy structure is a house built on sand.
“Computer scientists…have demonstrated they can often ‘reidentify’ or ‘deanonymize’ individuals hidden in anonymized data with astonishing ease.”
Ohm’s article describes HIPAA, in particular, as a fig leaf – or worse, as kudzu choking off the free flow of information:
“[I]t is hard to imagine another privacy problem with such starkly presented benefits and costs. On the one hand, when medical researchers can freely trade information, they can develop treatments to ease human suffering and save lives. On the other hand, our medical secrets are among the most sensitive we hold.”
Indeed, one might reformulate that statement:
When e-patients can freely trade information (with fellow patients, with family members, with health professionals…), they can track symptoms, treatments, and outcomes that would otherwise go unobserved.
That’s the hope and the promise of participatory medicine. Yet there is a danger to all that health data floating around.
Ohm uses a haunting phrase to describe the possibility of re-identification: the database of ruin. It will reveal all our secrets to everyone, at any time, and follow us wherever we go (calm down, it doesn’t exist yet).
My take on his essential message is:
Fear the database of ruin, but don’t become paralyzed by it. Instead, work toward its prevention.
That call should be heard by everyone, not just those of us living with diagnoses we want to hide. Ohm argues that only people with absolutely no secrets and no connection to the modern world can live free of the threat of the database of ruin, and he delightfully calls them “the unicorns and mermaids of information privacy.” We live in glass houses and type at glass keyboards, people.
Another phrase that is sticking with me:
“Utility and privacy are, at bottom, two goals at war with one another.”
The more useful a data set, the less likely it is to be scrubbed of identifying information. Think about the implications. If we want useful data, we need to make trade-offs on what might be revealed in that data. Who should make those choices? E-patients? Health professionals? Regulators? Trade groups? What groups or types of data should get special treatment? (See: “Children and Population Biobanks” in Science, 14 August 2009: 818-819 – hat tip to Chris Hoofnagle)
Ohm focuses on a lawmaker’s conundrum: regulation of reidentification is “the latest example of the futility of attempting to foist privacy on an unappreciative citizenry.” Indeed, regulators might point to the millions of people flocking to MySpace and Facebook, or the thousands participating in even deeper personal experiments of data tracking, and ask, “Who am I to get in the way of all this sharing?” Ohm argues that this laissez-faire attitude would be irresponsible, and I think e-patients should hear him out: “[T]oday’s petty indignity provides the key for unlocking tomorrow’s harmful secret.” In sum, Ohm’s article is a strong vote for data protection even as he eviscerates the current system.
You see, there is no such thing as “security through obscurity” when so many databases exist, containing all the clues someone might need to match your “25 Random Things About Me” with your search-term trail and, in turn, your financial or health records.
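For readers who think in code, here is a minimal sketch of that kind of matching, sometimes called a linkage attack. It is not taken from Ohm's article: every record, field name, and function below is invented for illustration, and real attacks work on far larger and messier data. The point is only that a "scrubbed" record can be tied back to a name when the fields left in it (here ZIP code, birth year, and sex) are unique to one person in some other, public list.

```python
# Toy illustration of a linkage attack. All records and names are invented.

# A "de-identified" health data set: names removed, other fields kept.
anonymized_health = [
    {"zip": "02138", "birth_year": 1965, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1980, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02139", "birth_year": 1980, "sex": "M", "diagnosis": "migraine"},
]

# A separate, public data set that still carries names.
public_profiles = [
    {"name": "Alice Example", "zip": "02138", "birth_year": 1965, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1980, "sex": "M"},
]

# Fields that are not names but can single a person out when combined.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def key(record):
    """Build a comparable tuple from a record's quasi-identifier fields."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(health_rows, profiles):
    """Return {name: diagnosis} for profiles that match exactly one health row."""
    rows_by_key = {}
    for row in health_rows:
        rows_by_key.setdefault(key(row), []).append(row)

    matches = {}
    for profile in profiles:
        candidates = rows_by_key.get(key(profile), [])
        if len(candidates) == 1:  # a unique match means the person is re-identified
            matches[profile["name"]] = candidates[0]["diagnosis"]
    return matches


matches = reidentify(anonymized_health, public_profiles)
print(matches)
```

In this toy data, one profile matches a single health record and is re-identified, while the other matches two records and stays ambiguous. Each extra field left in a data set makes unique matches more common, which is the utility-versus-privacy tension in miniature.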
All of which leads us to this question:
“Once regulators choose to scrap the current HIPAA Privacy Rule – a necessary step given the rule’s intrinsic faith in deidentification—how should they instead protect databases full of sensitive symptoms, diagnoses, and treatments?”
Nobody is on the sidelines of this debate. Yes, your participation in an online health data-sharing site puts you at greater risk, but Ohm points out that “stored search queries often contain user-reported health symptoms” and indeed, Pew Internet research has consistently shown that 80% of internet users have looked for health information online and search is usually the first stop. Few people want to cut off access to the vital information found online, but what about the opportunities for advancement through data sharing?
Finally, as Jane Sarasohn-Kahn points out, “Americans feel dis-empowered when it comes to health information technology.” Frankly, most people don’t even know the half of what is going on in this debate — imagine how they would feel if they did!
So, if you care at all about health information technology: read the article, form your own opinion, and get to work.
Great article by Ohm, great take on it by you, Susannah. REALLY puts a potent, relevant light on the subject.
When I first got exposed to all the deep geeky talk about HIPAA (in a meeting at the Center for Democracy and Technology – experts chatting informally about breach notifications, variations in laws, and all that), it was way over my head but I could at least say “As far as I can tell, ALL of these concerns are to protect people from having their data used against them, right?” Everyone nodded.
Now I see this in a whole different light: the data is NOT secure, not at all. So I think you’re exactly right: given that, what do we do?
I have to give credit to Jules Polonetsky, Co-chair and Director of the Future of Privacy Forum, who called Paul Ohm’s article “one of the most important papers of the year.” (Follow his tweets: http://twitter.com/JulesPolonetsky )
One of my goals in writing this is to bring together the two worlds of health mavens and privacy mavens. Just as we’ve urged statistical literacy and data literacy, I am urging privacy/security literacy.
Ohm’s article is a great place to start because it’s a page-turner — how can you resist a writer who refers to unicorns and mermaids?? This debate shouldn’t be limited to law professors and code geeks. Too much is at stake.
It is about time this topic gets more attention. Hopefully the folks at the VA are reading!
At first I read:
“Ohm uses a haunting phrase to describe the possibility of re-identification: the database of ruin. It will reveal all our secrets to everyone, at any time, and follow us wherever we go.”
…and I thought “Database of Ruin” sounds a bit Orwellian. But the reality is, as more and more data is collected it will be increasingly impossible to completely anonymize it.
But who would want access to this identifiable roster?
THEY would need ALL the data, very SOPHISTICATED ALGORITHMS, and trained HUMAN ANALYSIS to kludge it all together again…maybe some computer scientists and structured-data subject matter experts can make it LOOK easy, but it isn’t. This would be a considerably difficult undertaking. Although it seems nearly impossible, the entire health industry and IT industry are moving in that direction, and with their funding and talent for avoiding regulatory impediments, they’ll most likely get there.
I believe this is an inevitability due to the efficiencies and benefits of information sharing, especially in a field where basic, clinical, and longitudinal studies are very active, not to mention information-intensive decision cycles in treatment.
That said, the debate should be elevated to determine how to manage the risks…how to know when it’s worth the danger of ‘being revealed’ for the benefits of informing researchers and care givers. It may be more productive than the polarizing ‘they’ who will build a ‘database of ruin’ to destroy humanity vs. ‘they’ who impede science to ‘hide’ their ‘gluttonous’ preventable illnesses.
This type of novel research in Risk Management will benefit several other industries which, as a result of the Information Age (and globalization), are facing regulatory, policy, ethics, and poignantly ‘management’ debates surrounding information management. Many of these topics are classical legal and political issues that have resurfaced in modernized forms, being revisited generally because of IT advancements. A great case in point is digital signatures…many lawyers had to dust off the books when that rolled around in the early ’90s.
IMHO, it’s really a utility curve we should be working towards. A model which all stakeholders could leverage as a means to inform and empower themselves in their individual roles (especially patients and regulators), and to hopefully optimize the Health Care Information Market.
The reality is that the database of ruin exists in parts at companies like BlueKai, Rapleaf, and Experian. It’s not just search logs that can be used to identify you; it is also web cookies, financial transaction histories, and e-commerce databases. Since the technology exists to circumvent any de-identification procedure that would yield data useful for medical discovery, the future lies in how well we plan out and manage the web of trust and confidentiality agreements.
Orwellian indeed! The word panopticon comes to mind, too. When I put out a few key points of this article on Twitter I got back another haunting phrase: “gossip biography” (from @EvidenceMatters).
Thanks so much for adding these insights. I love the idea of working towards a “utility curve.” Can lawyers, coders, health researchers, patients, etc. all contribute to getting there?
This essay was crossposted to The Health Care Blog where another discussion has cropped up, in case you want to check it out: