Four years ago this week, e-Patient Dave published, “Imagine someone had been managing your data, and then you looked,” and forever changed the national conversation about health data. I have described that post as an earthquake — a surprise to those who were not looking for signs and indicators of trouble, not a surprise to those who listen and learn from patients. If you’re new around here, read the post and skim the 175 (!!) comments to get an idea of its impact.

I’d like to honor the anniversary by raising some new questions about health data in this cross-post from my personal blog:

Who provides the fuel for the health data fire? Hint: Look in the mirror.

“If iron ore was the raw material that enriched the steel baron Andrew Carnegie in the Industrial Age, personal data is what fuels the barons of the Internet age.” – a line from Somini Sengupta’s article in the Sunday New York Times, “Letting Down Our Guard With Web Privacy.”

I think personal data is fueling health innovation, which is why I hope Sengupta’s article is widely read in the health world. Who are the barons in the new health care enterprise? Who are the serfs? What assumptions are being made and what choices do people have about their health data — and are they aware of them?

In the article, Sengupta profiles Alessandro Acquisti, a behavioral economist who sees himself as an “observer holding up a mirror to the flaws we cannot always see ourselves” (ditto, and in my view, research can also be a window).

What can we learn by looking in Acquisti’s mirror? An excerpt from the article:

Our browsing habits, search terms, e-mail communication — even our offering of our ZIP codes at the supermarket checkout — reveal bits of information that can be assembled by data companies, usually for the purpose of knowing what sorts of products we’re most likely to buy. The online advertising industry insists that the data is scrambled to make it impossible to identify individuals.

Mr. Acquisti offers a sobering counterpoint. In 2011, he took snapshots with a webcam of nearly 100 students on campus. Within minutes, he had identified about one-third of them using facial recognition software. In addition, for about a fourth of the subjects whom he could identify, he found out enough about them on Facebook to guess at least a portion of their Social Security numbers.

The point of the experiment was to show how easy it is to identify people from the rich trail of data they scatter around the Web, including seemingly harmless pictures. Facebook can be especially valuable for identity thieves, particularly when a user’s birth date is visible to the public.

Does that mean Facebook users should lie about their birthdays (and break Facebook’s terms of service)? Mr. Acquisti demurred. He would say only that there are “complex trade-offs” to be made.

Indeed. I have heard about complex trade-offs before — and I bet you have, too.

What would you trade for a chance to discover whether a new drug will work for you or your loved one? What would you trade for a chance to contribute to an experiment that could improve your life? What data would you share for the greater good, even if there was no direct benefit to you?

Use of aggregated — or highly-identifiable, personal — data is not all done with nefarious intent. It can, in fact, be life-saving. But who holds the power when you face those choices? What are the patterns of behavior for the old way of doing things? How can people learn how to contribute in new ways to health research? What would happen if patients were able to fact-check their records and improve the data fueling innovation?

Join me in reading the article and thinking through these issues. Check out additional resources related to the effects of sharing and exposing personal information, curated by William Gunn and others. Please post observations and questions in the comments.

Note: I don’t want to lose the great insights that Nick Dawson and e-Patient Dave already wrote in the comments on my personal blog, so I’m pasting them here:


Nick Dawson:

My first thought, with articles like Ms. Sengupta’s, particularly when couched in the language of your first few paragraphs, is to be part of the voice heralding the new age of consumerism. More data in the hands of patients, to me, signals a change in the traditional doctor-patient relationship. Maybe we should start by calling it a patient-doctor relationship.

But this is deeper. Sengupta’s article and your post point to something more meaningful and widespread: the acceptance and promotion of vulnerability as a positive trait. Giving up data also means relinquishing fears about what the data reveals. Arrest records, shopping habits, health data: often (not always) we guard those things against the judgement of others. What happens when we let down those guards in favor of feedback, scientific and societal advancement, or shared decision making?


e-Patient Dave:

I love behavioral economics. Kahneman’s epic Thinking, Fast and Slow is mandatory reading, I think, for anyone who wants to fully grasp the gulf between what people honestly think they want and what they do. It’s not just stupidity or hypocrisy; Kahneman and his partner Tversky formally established the mechanisms by which the mind loses its grip and does something other than it honestly believes it’s doing. Without comprehending this we have little chance of creating policies that have a snowball’s chance.

I’d hoped to post something about my own data post, to which you linked – it appeared four years ago today. Boy, how the world of health IT has changed since THEN: the ARRA/HITECH incentives had just come on the scene, the words “meaningful use” had barely been signed into law…

… And people were awfully naive about data, at least in health IT. Some were foolishly naive, unaware, clueless but acting confident; others were just charmingly innocent. As far as I can tell, that’s the WHOLE reason people were so shocked about my post: they were naive about things that are routine to data geeks who actually get their hands dirty with the stuff.

Here’s an anti-naivete tip: it’s a big mistake to take data created in one context and read it back in another. That’s what happened in my case four years ago: data recorded for insurance purposes was read back as clinical data. It gave a catastrophically wrong impression. And people were amazed.

If I can, then, take a different tack on your post: you mentioned data fueling innovation; yes, data is fuel, data is combustible. It can be powerful, but if it’s contaminated it can wreck your engine, and if it’s mishandled it can blow up in your face. In my famous case the data was both dirty and mishandled.

I hope I’m not stretching it when I say that like medicine itself, data can work miracles; but if we expect miracles just because it CAN do miracles, we are naive. We can get really disappointed, and when things go wrong people can get upset. Mix that with the illusions going on in our earnest minds, and it’s a setup for disconnect.