We’ve recently been talking here about problems with poor study design in clinical trials. A health IT version of this problem raced through the newswires this week while I was on the road. The news coverage was particularly naïve, illustrating our point.
I’ll say at the outset that I haven’t corresponded with the study’s authors, and I’d welcome constructive dialog. I find myself frustrated, and I’ll lay out my reasons, open to correction.
Researchers at Stanford did a retrospective analysis of patient records from 2005-2007 (article here) and concluded that quality of care was no better in institutions that used electronic medical records. Journalists who don’t know how to assess study design – or even look beyond the abstract of an article – were sucked in and posted headlines like “Stanford researchers find EHRs don’t boost care quality.”
Before we analyze, please reflect: what’s your impression upon reading those words? What does that headline lead you to think about the push to computerize healthcare?
It’s hard to know where to start in cataloging the ways these headlines give an impression that’s way off base. For starters, I wondered what they meant by “quality of healthcare.” What did they measure? Did they assess how well the patients turned out? How few complications they had, how few side effects? To me those are pretty important quality measures.
Well, no, they only measured whether the doctors prescribed the right treatment (or medication).
So it seems to me the headline should say, “Stanford researchers find EHRs don’t change what doctors prescribe.” Very different impression.
But what do I know? I do know that in healthcare a lot of people think the right prescription – the so-called “standard of care” – is the definition of quality. And indeed, that’s how this study defined quality. (Did you know patients only receive the standard of care a bit more than half the time? Indeed, installing EHRs wouldn’t change that. Methinks that problem has nothing to do with computers.)
A deeper issue is the question of who gets to define what quality is. In every transformed industry, the customer does. In this one, quality is measured by whether one professional did what another one said to do.
But let’s stick to the issues of the research itself: social media to the rescue. On Twitter that day, health IT guru Brian Ahier steered us to this terrific analysis by Dr. Bill Hersh of Oregon Health & Science University. Bill catalogs the study’s limitations in clear English. A few excerpts: (Bill, I’m going to rip off chunks of your post, to reach lazy readers, in the hope that serious ones will click through for your whole great post.)
Like almost all science that gets reported in the general media, there is more to this study than what is described in the headlines and news reports. …
There is no reason to believe that the results obtained do not derive from the methods used…. However, there are serious limitations to this type of study and to the data resources used to answer the researchers’ question, which was whether ambulatory EHRs that include clinical decision support (CDS) lead to improved quality of medical care delivered.
“The data resources used” resonates with our posts this week about whether a study actually measures what it set out to measure.
…it is important to understand some serious limitations in these types of studies and this one in particular. A first limitation is that the study looks at correlation, which does not mean causality. This was an observational and not an experimental study. … As with any correlational study, there may be confounders that cause the correlation or lack of it.
The best study design to assess causality is an experimental randomized controlled trial. Indeed, such studies have been done and many have found that EHRs do lead to improvements in quality of care….
See Bill’s post for more. Then:
A second limitation of this study is the quality measures used. Quality measures are of two general types, process and outcome. Process measures look at what was done, such as ordering a certain test or prescribing a specific treatment. Outcome measures look at the actual clinical outcomes of the patient…
[In healthcare today] most measures used… are process measures that may or may not result in improved patient outcomes.
I guess that matches my initial gut reaction, above: they looked at a process measure (doing the right thing) rather than how the patient turned out. Quite a different concept of quality.
A third limitation … is that we do not know whether the physicians … had decision support [software] in place.
If I read this right, it means the researchers evaluated whether docs made the right decision, even though they had no data on whether the systems in use had decision support installed.
Please do read Bill’s post, and the original article, to fully understand the situation. My concern here is twofold:
- Before interpreting any study, find out what you can about how the study was done. As in this case, you might easily discover that they didn’t study what you’d think, based on the published conclusion.
- Be really careful about interpreting health news. Headlines are commonly off-base. In my opinion science writers ought to be spanked thoroughly for parroting a touted conclusion without at least looking as far as I did.
CNN’s Sanjay Gupta was one example: “Electronic Health Records No Cure-All.” After blindly accepting the published conclusion, Gupta branches into a discussion of the Federal stimulus bill’s incentives to computerize healthcare, then cites skeptics who warn of overreliance on computers, because they might crash, etc. (I agree, don’t rely on unreliable crap of any sort! Certainly not for mission-critical computer systems. Decades ago airline reservation systems used to crash sometimes; they fixed that. Engineering reliable systems isn’t rocket science; system buyers should ask for quality, just as patients should.)
As I said, I’d welcome dialog with the Stanford researchers. Hersh and I may be unaware of important factors. For the moment, I assert that the impression given by these headlines – and by the study’s published conclusion – is way, way off base: the 2005-2007 data give no indication of what the future holds. We don’t even know whether the systems studied contained decision support, the very feature whose effect was being measured.
Thanks again to Bill Hersh and to Brian Ahier for rapidly producing and sharing such great info.