Before you read this post, think of a time when you had a crush on someone. Think about that swirl of emotions, the highs and the lows. That’s where I was a couple weeks ago, except it wasn’t about a person.
I fell hard for Watson, IBM’s hot new outboard brain. I’d heard he was smart, kind of a know-it-all, but also trainable, a good listener, and maybe the answer to people’s prayers and complaints about the rising flood of information.
At least, that’s what I hoped as I found a seat in Martin Kohn’s Health Foo session. (What’s Health Foo? Read this.)
Kohn began with an explanation about why Watson’s first public task was to compete – and win – on the game show Jeopardy. Playing Jeopardy was not a long-term business model, he said with a smile, but a means to an end: mastering a complex English-language challenge.
Kohn dazzled me and many others with Watson’s voracious appetite for information. My tweet, “Doctors have only an estimated 5 hrs of reading time per month; IBM’s Watson can read 72,000 hours per day” was an instant hit as people re-tweeted it and began firing back questions. Tim O’Reilly tweeted, “In the 3 seconds of a Jeopardy question, Watson could read and understand 200 million pages of text,” which set off another flurry of interest.
But I was too much in thrall to answer questions on Twitter. He (and yes, the developers call Watson “he” too) can dynamically evaluate the quality of information sources, i.e., “If The New York Times is more often my source for the right answer than Wikipedia, I’ll prioritize that source next time.” He also maintains a confidence threshold that shifts dynamically according to a risk assessment, such as the state of the Jeopardy game or the amount of money at stake.
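For the programmers in the audience, the two ideas Kohn described (learning which sources to trust, and answering only when confidence clears a risk-dependent bar) can be sketched in a few lines. To be clear, this is a toy illustration under my own assumptions: the class name, the Laplace smoothing, and the scoring scheme are all invented here, and IBM has not published Watson’s actual internals.

```python
from collections import defaultdict

class SourceWeigher:
    """Toy sketch: track how often each source backs a correct answer,
    and abstain when confidence doesn't clear a risk threshold."""

    def __init__(self):
        self.hits = defaultdict(int)    # times a source backed the right answer
        self.total = defaultdict(int)   # times a source backed any answer

    def record(self, source, was_correct):
        self.total[source] += 1
        if was_correct:
            self.hits[source] += 1

    def weight(self, source):
        # Laplace-smoothed success rate, so an unseen source starts at 0.5
        return (self.hits[source] + 1) / (self.total[source] + 2)

    def answer(self, candidates, risk_threshold):
        """candidates: {answer: [supporting sources]}.
        Return the best-supported answer only if its share of the total
        score clears the (game-state-dependent) threshold; else abstain."""
        scored = {a: sum(self.weight(s) for s in srcs)
                  for a, srcs in candidates.items()}
        best = max(scored, key=scored.get)
        confidence = scored[best] / sum(scored.values())
        return best if confidence >= risk_threshold else None
```

The point of the sketch is the shape of the behavior, not the math: the same candidate answers can yield a response in a low-stakes moment and silence in a high-stakes one, purely because the threshold moved.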
Joy was welling up inside me as I heard Kohn explain that the health care version of Watson will never say, “This is the answer.” Instead he returns prioritized possibilities. Watson will be an enabler of the information-sorting process, not an instigator (which would answer most of the push-back questions pouring in via Twitter).
As Kohn put it, Watson is the friendly, diplomatic voice in your ear reminding you to think more broadly. Kohn compared Watson to the experienced nurses he worked with as a doctor-in-training, who quietly asked, “Have you considered X? Would you like me to start Y?”
Kohn also shared that one of IBM’s goals is to empower knowledgeable patients with information because they will then be more likely to understand and follow a care plan.
That was it. I had a crush. I was ready to introduce Watson to my friends, the next step in any courtship.
Luckily a big group of them was there with me. People started asking questions, such as, “Watson has the potential to break the traditional medical journal publishing model. He reads all the journals so doctors don’t have to. Will this spawn new subscription models?” Kohn acknowledged the intellectual-property question but didn’t answer it, since disrupting the publishing industry is not part of IBM’s business plans. He agreed that Watson helps overcome the flaw of availability (you only consider the options that come to mind) and the flaw of self-reinforcing bias (an enlarged ego can cloud decision-making) by taking in all of the published medical literature.
Wait, what was that? I stopped dreaming and looked up. Another question from the group: “You say that Watson will take in all published studies – what about unpublished studies, such as those submitted to the FDA during a drug-approval process?” And another: “What about the medical knowledge not captured in published journal articles – will Watson be equipped to take in that information?” And another: “You say that Memorial Sloan-Kettering is your first client. Their clinicians do not tell kidney cancer patients about a certain treatment – will their version of Watson be taught this bias?”
Uh-oh. The answers to these questions – and others – whipsawed me between hope and disappointment. Hope that Watson is potentially what we’ve been waiting for – a learning system to assist medical decision-making. Disappointment that Watson is potentially reinforcing the traditional model of “doctor knows best,” instead of the new, participatory medicine model of “doctor knows a lot, but let’s work on this together.”
Here’s the thing about Health Foo: You can’t get through your slides without being interrupted by such questions. My sense is that the questions stem from curiosity and collegiality. We want you to succeed. We want to fall in love with you and your ideas. But we also want you to be the best you can be and will let you know if we see a flaw, like, um, an amazing tool that is going to be trained according to the old model of health care, not the emerging model.
And so, within the space of an hour, I fell in and out of love with Watson. Like many a jilted lover, I gathered with friends to hash out why I was so upset, even angry. I had set Watson on a pedestal, if only for about 15 minutes, and when reality came rushing in, I felt betrayed. Why couldn’t he be perfect? By the end of our conversation I had come back to a feeling of hope. Watson may not be perfect, but he’s on the path. Maybe he has a younger brother.
Susannah,
Thank you for writing so eloquently on the subject of IBM Watson. I missed Health Foo (sadly, not invited), but had the chance to attend a lunch meeting last week on the topic of Watson, where Dr. Kohn was the speaker.
I, too, see many exciting possibilities for Watson or Watson-like technology. As you point out, one shortcoming of a decision support system built upon a pre-defined corpus of information is the quality and comprehensiveness of that corpus. For pilots, IBM has secured rights to use a variety of sources, but the dream of a widely available service that draws on the same breadth of information will require some intense licensing efforts. Plus, the published literature is not comprehensive and is certainly not a time-efficient way of accessing the latest medical research findings.
On the bright side, Dr. Kohn did talk about the consumer portal, Ask Watson, that could be used in conjunction with EHRs and PHRs, and mentioned that incorporating patient preferences could be built into Watson. Maybe this was added to the presentation after Health Foo!
In any case, I think I would fall in love with a version of Watson that I could query directly.
Thanks, Janice!
Dr. Kohn did describe the consumer portal and it sounded very intriguing. The inclusion of EHR/PHR data carries its own opportunities and challenges, wouldn’t you say? I’m thinking, of course, of Dave’s experience when billing codes were ported over to his online PHR, with no dates attached:
Imagine someone had been managing your data, and then you looked.
http://pmedicine.org/epatients/archives/2009/04/imagine-if-someone-had-been-managing-your-data-and-then-you-looked.html
Wow! I stepped away for 1.5 days and just returned to a long and interesting comment stream.
With respect to my comment about a patient portal, ideally I envision a tool for patients to query Watson directly to learn about the range of options (a la Dave’s vision to include social media content, current findings described at conferences, and better yet, analysis of real-world outcomes data from EHR repositories), all personalized according to the patient’s preferences (e.g., minimize side effects that would prevent me from ….) and known genetic (or other biomarker) data. Isn’t access to the same body of research that the doctor consults key to “shared decision making”?
This time, I’ll tick the “Notify me” box below so that I see follow-on comments. Your posts always attract a diverse group of commenters, although I just realized that they’re all men except for you & me!
Hi Janice. The possibilities of IBM Watson are many. And while the current focus is putting it in the hands of medical professionals, direct patient access is certainly a possibility for the future.
Be careful what you wish for regarding email notifications – sometimes the discussions go on for weeks on e-patients.net!
I will gently re-introduce a few questions that were brought up at Health Foo and are simmering here: What is the business model? Who are the clients? How does that affect Watson’s configuration (to use Dave’s word)?
It’s not at all surprising that a corporation would target hospitals and large health systems as clients, knowing that they will have the resources to pay for a project of Watson’s magnitude.
But I’ll raise again the question of practice model, which (I think) is different and separate from business model. What we hear on this blog (and elsewhere) is that the medical knowledge base is imperfect and other sources, such as expert patient observations and insights, can improve it.
Many people may wish that we could skip over the “crazy, crazy, crazy” stage to get to “obvious” but why not shine the light on the path so everyone can see it? (see this post for more on that theme).
Michael, in case there’s any doubt, it’s fine with me for Watson to be primarily marketed to providers for now – y’all are doing a massive project, and I think it’s prudent to pick your projects sensibly.
The thing that really concerns me, as I said before, is the inherent error of using it to support a flawed model of what’s the best information. I know that’s always a judgment call (a matter of opinion) but I urge y’all to be thinking (and back-room developing) ways of harvesting fresher information.
In fact that by itself ought to be a valuable add-on service.
This April, 2012, Wall Street Journal article by Amy Dockser Marcus provides another angle on what we’re talking about when we talk about a new practice model:
Patients as Partners
An online network for sufferers of inflammatory bowel disease provides some clues to the power of collaboration
http://online.wsj.com/article/SB10001424052702304692804577281463879153408.html
Thanks again Dave and Susannah.
Hopefully I can answer your questions. Before doing so, I’ll say that while IBM does have a good number of MDs on staff, medical expertise is not IBM’s core competency. So we look for that expertise from other organizations and individuals.
1) What is the business model? While IBM Watson is currently in pilot mode, IBM intends to deliver Watson as a service with value-based pricing. That is, IBM receives a negotiated fee based on the incremental value delivered as a result of the IBM Watson solution.
2) Who are the clients?
Right now, IBM has health care pilots with WellPoint and Memorial Sloan Kettering, and finance pilots with Citigroup and an institutional bank. In the future, IBM intends to expand into additional industries and use cases and, at least initially, will focus on professionals as the primary direct users. Of course, it is hard to predict what the future holds, and consumers may some day be direct users as well. Watson is best suited for data-intensive industries that:
• Require the analysis of high volumes of both structured and unstructured data
• Benefit from the speed and accuracy of a response to a question or input provided
• Desire to systematically learn with every outcome, action taken, and iteration
• Have critical questions that require decision support with prioritized recommendations and evidence
3) How does that affect Watson’s configuration?
There are aspects that remain constant from one implementation to another, while others vary. Platform-level components of Watson’s technology, like its natural language processing abilities, machine learning models, question analysis, and hypothesis generation, remain constant. In contrast, aspects that differ from one implementation to another include the body of data that is assembled, the user interface of the Watson-enabled application, and the training Watson undergoes in preparation for a specific use case.
Dave – to your point about data… yes, it’s true that no single source of data is perfect. Medical text books are imperfect. Peer-reviewed journals are imperfect. Expert patient communities are imperfect. Ultimately, the decision of what to include in a given data corpus for a specific use case and implementation is that of the client. There is no technical reason why crowd-sourced data contributed by individuals could not be added to a data corpus, and there may come a day when that happens. Through its machine learning capabilities, Watson can gradually learn to weight data derived from a given source higher or lower based on its reliability in contributing to a successful outcome.
Susannah,
I LOVED your love-and-loss story about Watson. As you’re probably aware there have been many efforts to use computing power to assist doctors in the last three or four decades: “expert systems,” “decision support” systems, and “knowledge management” processes. All have gone through the hype-cycle of initial enthusiasm followed by disillusionment. That’s not to say there has been no progress; Watson is no doubt an advance in language interface and in exceedingly speedy “big data” processing. How useful it is in the long run remains to be seen. Your learned ambivalence about Watson will serve you well in evaluating it and other similar systems. In my opinion, a big nut of skepticism is warranted in judging these products.
Janice McCallum put her finger right on the most important issue: the quality of the information that goes into Watson. After years of work IBM refined the Jeopardy database and processing algorithms to an amazing degree, but that doesn’t necessarily transfer over to great assistance to physicians and medical researchers. I suspect they’ll be devoting tremendous resources to the Sloan-Kettering project, most likely paid for by IBM. This is very much another demonstration project with more breakthroughs needed in Watson’s algorithms. But what’s the quality of the data input? The oldest aphorism in IT is: “Garbage in, garbage out.” What factors will influence the input? What are the criteria for evaluating the sources? Can they really get a comprehensive database of medical information, or will the totality be constrained by copyrights or the cost of licensing the data? How will the biases of the physician and researcher bureaucracy at Sloan-Kettering affect the project? What are the assumptions that go into the algorithms for calculating the probabilities, and how transparent will that be?
We must also keep in mind that Watson is a business product of IBM to be sold to other very well-funded health industry institutions such as big hospital chains, research centers, insurance companies, Big Pharma and others. It’s no wonder, Susannah, that you detected that it’s going to fit hand-in-glove with the traditional medical industry because they’re the ones who have the money for Watsons. That’s the market, and Watson MD will serve those needs. (And Watson Esq will eventually serve the law business.)
Another thing to watch is how the marketing pitches go. Mr Kohn (interesting name for a marketer) was quick to toss out the wow-factor numbers about Watson plowing through 72,000 hours of reading per day while your MD only does five per month. The appropriate response to that is, “So what?” Engineers and suede-shoe salesmen love to dazzle laymen with figures like that. “Gee, Watson sure must be smart!” Really? Speed only means something if a whole list of other criteria are well met.
Well, I guess I’ve revealed that in the many years I’ve followed AI and other gee-whiz technologies and watched more than a few of them peter out I’ve gone past disappointment to cynicism. What I’d like to see is IBM fund three Watsons: Sloan-Kettering Watson, MD Anderson Watson, UCSF Watson. Have three totally independent, non-collaborating teams create Dr Watsons. Then have the three engage (head-to-head?) in a Jeopardy-like competition in medicine. Let’s see if they come up with the same answers. If they do then we’ve probably really learned something. If not, why not? Perhaps just as interesting would be to see if a panel of top human “experts” in oncology and other specialties could agree on the answers.
While I’ve become a wait-and-see judge of these things, I have to agree, Susannah, that Watson is a step in the right direction. But I think a lot more steps are needed. And in the end I think participation by us folks will be as necessary as ever. We’re going to have to judge whether we trust Dr Jones or Dr Watson to be our mediator. Which Watson would you trust more: the one trained at Stanford Med School (maybe with a degree hanging on its cabinet) or the one that matriculated at, say, Johns Hopkins? I suspect this effort is going to reveal that a great deal of uncertainty remains in the science of medicine and that’s something we all have to live with. A list of weighted choices is the best we’ll get, and our Watson may give a list next year somewhat different from this year. We’ll still have to devote a substantial part of our lives to understanding ourselves as unique individuals in our own context. Watsons will be heavy-duty tools for information plowing, but the decisions–and the uncertainty–are still mine.
Wow, what if your idea for a Watson bake-off actually happened? I would love to see that.
It sounds like you don’t need any further evidence or arguments in favor of a more thoughtful approach to AI, but just in case you haven’t read Diana Forsythe’s book:
Studying Those Who Study Us
An Anthropologist in the World of Artificial Intelligence
http://www.sup.org/book.cgi?book_id=%204203
My favorite line of hers:
“Whose assumptions and whose point of view are inscribed in the design of this tool?”
She urged, “Design for what could be.”
Indeed.
> “Whose assumptions and whose point of view
> are inscribed in the design of this tool?”
Or, now we can ask it with “…in the *configuration” of this *installation* of the tool?”
> She urged, “Design for what could be.”
Do you think Watson *is* designed for what could be? Do we know where the design leaves off and the configuration takes over?
First, Susannah, I just frickin LOVE the way you write. Your playful mastery of apt metaphors is like riding a great roller coaster (or something) while listening to a story, with it all coming to the end at the same time and place. What fun.
NOW. === Screed alert ===
Maybe it’s because I first did some programming (punched paper tape on a Univac desktop box) in 1966, at the U of Minnesota’s summer program for high school geeks. Maybe it’s because last weekend’s college reunion reminds me that in college I lived with a slew of mega-programmers: I was once thrown in the shower by Paul Mockapetris, who a few years later authored the DNS spec, and at the other end of the “Paul animal” spectrum, I lived down the hall from Paul Karger, one of the original Multics geeks (and I do mean geek).
For those who don’t know, Multics led to Unix: “Unix is one of whatever Multics is several of.” Karger was an extreme thinker, very out of the ordinary in every way. (Read his obituary.)
In any case, as long as I’ve known computers, I’ve known that they’re precisely and profoundly limited by the thinking and choices of their designers, and those choices are inherently NOT UNlimited. So it’s ALL about what those choices were, what they might have been, and what you end up with.
When I first learned FORTRAN, I learned that it’s a FORmula TRANslation language, designed to make it easy to program scientific equations. I never learned COBOL, but I knew it was a COmmon Business Oriented Language, without so much mathy equationy stuff, but designed to do businessy stuff really well, with easier programming.
Etc etc.
What I loved about Watson before FOO was that its learning would be endless: it could find stuff that the user (supposedly a doc, in this case) never imagined, even stuff that was just published yesterday.
That’s why, in the FOO session, I instantly jumped on the “selection bias” inherent in letting a prejudiced owner put its personal blinders on the computer. Ow ow ow, wrong-headed. Perpetuation of prejudice. Exactly the same as if a paternalistic, well-meaning Big Daddy were to cull out what volumes would be exhibited in his town’s new library, so none o’ that perverse “Black” literature (or Commyanist literature) would pollute people’s minds.
All such filtering is done with the best intentions, of course. That’s paternalism in a nutshell.
The other point I raised at FOO was what geeks call information latency: the time delay between when new information comes into being and the time it’s been published. Doc Tom Ferguson called it “the lethal lag time,” and the way I understand Sloan Kettering is using the system (limited to reviewing the literature), the delay can indeed be lethal: As I told Dr. Kohn, the 2-5 year publication delay is a bitch of a problem if your median survival at diagnosis (like mine) is 24 weeks.
Watson could be the miracle messenger here, delivering the latest, like a daily newspaper! But not if it’s stopped from doing that, by being told to only look at the literature – and only mention treatments the local doctors recommend.
(And that’s not to mention what Ben Goldacre talked about at TEDMED – “the missing data” that does exist for clinical trials that never make it into the literature; he called that missing data the cancer at the core of evidence-based medicine. We want to formally institutionalize that??)
Note that all this isn’t anti-doctor — to the contrary, no front-line clinician can possibly do his/her best without the latest info. No; all this is about using Watson to do what it does great: bring together all the information; cure latency; bring unrealized life-saving information to the point where the patient – and the clinician – need it.
The REAL miracle-worker “Watson: Come Here, I Need You” edition will keep up on the latest scientific conferences, blogs, and social media. And he’ll keep tabs on what good patient communities are talking about. Imagine, for kidney cancer, being told about the full range of treatment options, AND that the ACOR patient community found a new side effect affecting Sutent patients, AND that in Professor Zingberwaddle’s presentation at ASCO last week, slide 14 had new information on outcomes for patients with a particular histology.
In my uninformed suspicion there’s no reason that can’t be done today. Last year I read that “Jeopardy Watson” reads all the newspapers every day (or some such). But if I understand correctly (I might not), “Sloan Kettering Watson” has been told to knock it off. That sounds dumb, to me. :-)
==========
For those who don’t intimately know the Gartner Hype Cycle, I encourage you to sidle up to it, and ask it if it’s heard recently from Cousin Watson, who seems to be sliding over the peak, into the trough. :-) See the picture, and roll over the sections to get the descriptions. The trick is always to reach Enlightenment.
I had fun writing this post and am very glad people like it (so far – critiques also welcome).
What I hope is that it will incite arguments on all sides — history lessons, ethical debates, spirited defenses. As Alexandra Drane recently wrote, we need to be intellectually brave “with a good dash of humility, curiosity, and chest-out-to-the-tape passion thrown in!!”
I came out of the Watson presentation with a simple opinion: what a wasted opportunity!
Watson now strongly appears to be great technology deployed with an absolute lack of social innovation.
The lack of courage is apparent: the technology will not be used to push forward easier access to knowledge by those who need it most. In fact, it was pretty clear from the HealthFoo Watson presentation, that it is a marketing tool designed to please and impress the professionals who thrive in the paternalistic healthcare system we so strongly despise.
I am less than impressed with the lack of vision, when you consider how easily such technology could help transform this nation into a nation of engaged and well-informed patients.
Well, Gilles, as my comment above suggests, I’m hoping that’s just a config issue …. and that dialog like this (and their participation at FOO) will provide the feedback to point the thing in the right direction.
We shall see.
I enjoyed watching Watson play Jeopardy, and was intrigued by press announcements shortly thereafter suggesting its next big hurdle would be health care.
My history with computing does not go back quite as far as Dave’s, but as a philosophy major in college, artificial intelligence is what initially drew me into the field of computer science. I was – and am – fascinated to learn more about how we think and learn.
My expectations of Watson are tempered by the experience of Mycin, an expert system developed in the early 70s by Edward Shortliffe (who continues to be active at the intersection of health care and computing), which exhibited performance superior to human experts, but was not widely adopted – outside of educational contexts – due to other, more institutional, factors.
While the advisory context in which Watson is currently being pitched seems like it might overcome some of the earlier institutional hurdles, having listened to the most recent NPR On The Media episode, which included segments on the undermining of science journals and the rise of retractions in such journals, I wonder whether digesting all the information in medical journals would lead to better or worse diagnoses.
It’s very interesting to read this post and the responses to it. When I blogged about Watson’s first steps into healthcare last September, the concerns I observed in my reading were almost entirely from doctors. They objected to a machine compromising the “art” of practicing medicine. Talk about paternalism! Frankly it bothered me quite a bit, because quicker and better information delivery will, to my mind, lead to better care.
Given that background, the objections here caught me somewhat by surprise. “Garbage in, garbage out?” It’s the same curated “garbage” that human doctors currently ingest, only Watson can take in vastly larger quantities. So what knowledge do our doctors possess these days? Are they only capable of sputtering out a few dust bunnies’ worth of data as opposed to Watson’s mountains of garbage?
The disparity of the viewpoints makes me pessimistic about possible reconciliation. Watson could be a big step forward for the current system, but it is being resisted by many within the system itself. As for leveraging it to change the system, well, I can imagine the MD outcry to enabling the incorporation of uncurated data from a consumer portal. How do you develop an AI to sort through the large amounts of true garbage (and you know it’s out there–snake oil salespeople, hypochondriac rantings, inane associations, etc.) to extract the valuable knowledge that does exist in blogs, commentaries, patient communities, and elsewhere? That sort of thing can only happen if doctors and patients work together toward a common goal, which isn’t going to happen under the current regime.
So often lately I come to the same conclusion: There are so many healthcare advances in the works from which we will not benefit unless there’s some sort of Topol-esque creative destruction. Only within a new context can Watson achieve its potential, which I still think is formidable.
Hi Mark! Please post a link to your post last fall.
I don’t see the disparity of viewpoints as pessimistic, personally. The transformation we’re talking about is VERY large, and formidable in its magnitude and encrustedness. Every viewpoint I’ve seen here is valid.
Re filtering out the garbage: three years ago we had a robust discussion about MedPedia, which attempted (erroneously, I believe) to solve this by limiting its editors to people with specific credentials. As the discussion there showed, that standard had already led to substantial errors in their content – because behind the facade of reliable academia, there’s actually no vetting at all, no retrospective checking for garbage that slipped through in credentialed people’s words, and no process for reporting and fixing errors.
That points the flashlight directly onto a culture that’s in denial about its own human fallibility. And of course denial disables improvement.
Anyway, check out the MedPedia post and comments – I ended up suggesting Amazon-style thumbs-up/down ratings, crowd-sourced … with separate tracks for consumers and clinicians / expert patients (those who know the field and the lingo).
A great thing about the Amazon system is that the reviewers themselves get rated!
Most ironic of all, IMO, is that in the early days of the Web people said “I’m not buying online – I don’t know who it is,” but today we have online reputation systems, and on eBay or Amazon it’s harder to sell without a good reputation among the community. Fascinating.
I wonder if Watson could learn from what the community has said is high quality!
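Dave’s rate-the-raters idea can be sketched in a few lines of code. This is purely a hypothetical scheme of my own devising (the function, the default reputation for unknown reviewers, and the weighting are all invented for illustration, not anything Amazon or IBM has published): each community rating counts in proportion to the reviewer’s own community-assigned reputation, so a trusted expert patient’s thumbs-up outweighs a drive-by one.

```python
def weighted_score(ratings, reviewer_rep):
    """ratings: list of (reviewer, stars) pairs, stars on a 1-5 scale.
    reviewer_rep: dict mapping reviewer -> reputation in [0, 1].
    Unknown reviewers get a small default reputation of 0.1, so
    they count for something but can't swamp established raters."""
    total = sum(reviewer_rep.get(r, 0.1) for r, _ in ratings)
    if total == 0:
        return None  # no ratings, or all raters at zero reputation
    return sum(stars * reviewer_rep.get(r, 0.1)
               for r, stars in ratings) / total
```

A Watson-like system could then feed such community-weighted quality scores into its source-selection step, which is exactly the “learn from what the community has said is high quality” loop.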
Thanks Dave,
I tried several things while writing my response and couldn’t make a nice hyperlink like yours–maybe my Firefox is too ancient? Anyway, here’s the ugly URL link:
http://community.jax.org/genetics_health/b/weblog/archive/2011/09/21/from-jeopardy-to-healthcare-watson-s-medical-promises-and-pitfalls.aspx
My blog focuses on genomic medicine mostly but has also addressed some of the many facets (and frustrations) involved with implementing significant change into our ossified healthcare system.
I’m very happy to report that @IBMWatson tweeted the following today:
“Great article! #IBMWatson hosts continuous ‘bake-offs’ with recursive feedback loops to learn & improve from experience”
And:
“Love the article. One point: there is no limit to the data #IBMWatson can ingest; users pick the sources”
I invited them to join us here, promising a friendly community, so um, be friendly, OK?
Hi all. Michael Holmes here. Program Director in the IBM Watson Solutions group. I sometimes use the @IBMWatson twitter ID. Thank you, Susannah, for a great article. I have a few thoughts to add to this engaging and constructive discussion.
1 – IBM Watson has only just begun to take the first baby steps in a multi-decade journey. So, many statements about Watson made by me, IBM, our pilot partners, or anyone else are forward-looking. It’s an exciting time in our industry in part because it is hard to anticipate the specifics of how Watson will be used and what impact it will have.
2- IBM Watson can continuously ingest and analyze data (both structured and unstructured) and discover new patterns and insights in a matter of seconds. The data sources selected for Watson to ingest are up to the organizations that use it. So if users want to include things like unpublished studies, that’s fine. However, unlike traditional systems, Watson learns and improves with every iteration through recursive, virtuous feedback cycles. It does not rely on business rules, or decision trees. So it considers the outcomes and balances the weighting of various data sources higher or lower to give better advice in the future.
3 – A unique combination of three capabilities sets IBM Watson apart from other systems:
• Natural language processing to help understand the complexities of human speech and writing, both as a means of user interaction and as sources of data
• Hypothesis generation and evaluation by applying advanced analytics to weight and evaluate a panel of responses based only on relevant evidence
• Evidence-based learning based on outcomes to get smarter with each iteration and interaction
Welcome! Thanks for joining in the discussion, which is growing more interesting by the minute.
Your first point captures why I fell for Watson and eventually came back to feeling hopeful for the future – your team’s acknowledgment of the iterative development process.
I have continually found in my own research (see: pewinternet.org) that we can’t predict how people will adopt and use technology. The best we can do is to focus on the present, measure current realities, and constantly check our assumptions to be sure they are not obscuring possibilities.
Would you be open to a collaboration with an expert patient community? One that guards against misinformation and has scads of evidence and resources? If so, this community can help you connect with one or more of those networks.
Thank you very much for the generous offer! IBM works closely with users in everything it develops and IBM Watson is certainly no exception. At this point we have been collaborating heavily with healthcare professionals (primarily at Memorial Sloan Kettering and within the WellPoint network) but patients are absolutely part of the plan. I will work with our development team to figure out when will be an ideal time to engage. And I would love any thoughts from any of you on any specific areas in which you would like to collaborate. Thanks once again for the great conversation and generous offer.
Two other tweets worth sharing before they scroll away into oblivion:
“So Sci-Fi! | @SusannahFox falls in love with a robot!” – @drsteventucker
“Maybe you need Watson’s younger sister?” – @Dr_som
That second one made me laugh out loud – why limit ourselves to male-pattern AI? Who’s ready to step up and describe how a female version of Watson would be different?
Excellent post, as always.
Though it sounds like you fell out of love with Watson’s controlling extended family (ie, the established medical system), rather than with insurmountable flaws in the young wunderkind.
Yes! I guess I am still holding a torch for Watson. But those in-laws – ugh ;)
Magical mathematics determined by philosophy, and myth proved by successful use since 1985. VoiceCommand, which today is simply “voice command”, was the first listening interface based on “dynamic time warping” linear predictive coding, built to satisfy algorithmically the statement “interactive question-specific response technology” using human voiced (puffs of air) speech sounds, whose patterns are matched right and left for the best fit to the input signal. The best match is shown on the computer screen, with all executables initiated. VoiceCommand allows 1,000 executable keyboard commands to be programmed within a 1.6 m/sec time frame, from a single word or a 4–6 word phrase in English. Phrase use improves semantic understanding, versus single words arranged into a complete sentence for meaning: subject, noun, pronoun, adjective, vowel, verb, adverb, and that’s the short list, obviously. The emotions in voiced human sounds, and those of many other species, travel great distances with no loss of meaning, because water conducts audible sound, including noise, best.
All of the details of my use are based upon actual experience. Many, many tests were required to master the capability that voice command voice recognition had, and still has, and it depended upon the required pre-programmed voice command features. Apple announced the voice command connection to their “Siri” interactive question-specific response algorithm. IBM’s Watson is an artificially intelligent interactive question-specific response technology using 15 separate languages, with language switching on command: speak in one, then answer in another, or with multiple output replies. All rely on each individual word or phrase utterance being processed in cloud supercomputers, wirelessly, obviously.
Knowing something important would sometimes have to wait for the correct introduction. Single detailed knowledge, then used by the masses under two words, “voice command”, was announced today, 06/11/2012, by Apple’s president, Mr. Cook, using the algorithmic “dynamic time warping”: linear predictive pattern matching, mathematically to the right and then the left, looking for the best match. The input signal (after listening) is encoded into a digital equivalence signal (via an A/D converter) based upon the frequency, the loudness, the type of microphone (omni- versus uni-directional), and emotions (there are at least five possibilities). 1985 was a good year for learning while selling and installing VoiceCommand and Vocalink, which were versions of the same voice recognition program: three programs used sequentially, with up to five menus active and the capability of context switching across all five. Each, within up to 1.6 m/sec of spoken sound (puffs shaped by the human vocal tract), can execute up to 1,000 commands across all five menus, including off-site access via cloud wireless technology. Command control, voice-to-text, and speaker independence, including audio response, are mathematically determined by complex stochastic modelling, from HMM data collection and analysis to the final best answer. This is a “learning” algorithm. It is improving itself automatically, behind the visible listening.
All of the details of my use are based upon actual experience. Many, many tests were required to master the capability that voice command voice recognition had, and still has, and it depended upon the required pre-programmed voice command features and the skill of the user in 1985. Today voice command is “turnkey”: a “natural language”, almost “conversant”, listening artificial intelligence, from Apple’s voice command connection to their “Siri” interactive question-specific response algorithm, to IBM’s Watson artificially intelligent interactive question-specific response technology using 15 separate languages. Language switching on command? Speak in one, then answer in another, or with multiple output replies. All rely on each individual word or phrase being processed in cloud supercomputer technology, wirelessly. “Siri”, Watson, every car manufacturer, smart voice-recognition televisions, and other device control have exploded.
Hi Jack. Just a point of clarification. IBM Watson and Siri are designed to solve very different problems.
• Siri’s natural language capabilities focus on input/output, providing a speech-enabled front end and application integration for a fairly limited set of consumer-focused needs (e.g. sending a text message, activating mobile device functions, getting directions to a restaurant). When questions expand beyond that limited domain, Siri simply offers to do a web search. In contrast, Watson’s natural language capabilities span broad professional-level domains from an input/output perspective, and also encompass analysis and understanding of natural language as a source of input for its responses, rather than relying solely on traditional structured data.
• Siri is designed around single-question/single-response interaction, while Watson provides a confidence-scored, evidence-based panel of responses on an iterative, per-case basis, along with the evidence supporting each response.
• Siri improves based on software code updates performed by the manufacturer while Watson improves with every iteration through recursive, virtuous feedback cycles that compare its responses to actual outcomes and adjust algorithms for better guidance in future situations
This falling in and out of love with Watson is extremely anthropomorphic, and fussy. More later.
Breaking… Med students are as threatened by Watson as docs in practice. Facing a strong headwind here…
Bob, I was wondering when (or whether) med student reaction was going to show up here! So thanks, and why not invite some of ’em to dive in here??
Of course I’m not a medico but here’s my first-draft view:
1. “All knowledge is in constant beta.” Gilles Frydman of ACOR said that two years ago, and I quoted him in my next keynote. We all (especially med students) learned in science that everything we think we know is subject to revision when new info comes along.
SOMEhow, a large subset of scientists have forgotten this, and have taken on an unworthy arrogance about “I know and you don’t, and it’s going to BE that way, okay??”
2. “The useful half-life of a medical education is about four years.” I heard that at a conference recently; I’ll try to find the original cite, but I may not be able to. Maybe it’s 6 years, or 8, but the bottom line is the same: half of what you learned in school is obsolete a few years into the future.
Not gross anatomy, of course! (I HOPE not.) But treatment options, etc.
So I’ll ask the docs here: what % of what you use every month is the same as when you got out of school?
3. That raises the question: whose responsibility is it to keep up on everything? Should clinicians kill themselves with reading and newsletters, or should a trained robot do that, on demand?
4. For me, at the end of that rabbit hole is a profound realization:
Nothing says this better, IMO, than Susannah Fox’s work at Pew Research. In last year’s report “Peer-to-Peer Healthcare” she reported that “In the moment of need, most people turn to a trusted professional,” and in the next section, “People turn to different sources for different kinds of information.”
I’d LOVE it if you could share that whole paper with your students and see what they think.
The truth, as far as I can tell, is that healthcare is “dis-integrating” – previously bundled pieces are coming apart and becoming available separately. There’s no question in my mind that no matter what info is available to the public, I’m goin’ to my DOCTOR to get problems solved.
Good points.
FWIW, the machine learning community has a term to characterize the problem of shifts in a knowledge base: concept drift. I don’t know how well Watson can or will handle this, but it would be essential for success.
As for the value of the “trained mind”, I would highly recommend reading Jonah Lehrer’s review of Daniel Kahneman’s work on cognitive biases, published yesterday in the New Yorker, using the provocative title Why Smart People are Stupid, which describes issues impacting the performance of “experts” relating to blind spot bias, overconfidence, extreme predictions, and the planning fallacy (among others).
Thanks for your comments, Dave. I hold them in very high regard (as I do you), and must bring them and this wonderful post by Susannah to my students’ attention. Also, I’ve now downloaded Susannah’s “Peer to Peer Healthcare” report for posting on my PM101 course website. Excellent recommendation!
Additionally, the follow up reply to your comment by Joe McCarthy, regarding “Why Smart People are Stupid”, is the perfect finishing touch to everything you’ve astutely mentioned.
Hi Dave. We recently posted a short video that speaks to this, among other topics, that is worth taking a look at. It’s called “Voice of the Physician”; it was filmed at the HIMSS event and primarily features members of our HC board of advisors.
http://www.youtube.com/watch?v=63xe8gXODEM&feature=plcp
In brief, IBM Watson is an information resource. It does not make decisions or take action. Doctors, nurses, medical students, and other HC pros do that.
Hi Bob. Thanks for your post. We recently created a short video called “Voice of the Physician” that discusses this among other topics. Link below. In short, IBM Watson will have many use cases in the future. But the one we’re working on hardest now is helping medical professionals.
How is IBM getting permission for Watson to mine the medical literature? Negotiating with each individual publisher of relevant journals?
To me, Watson highlights what I believe is one of the largest problems in scholarly publishing, which is that it is very difficult or impossible for researchers to mine the full corpus of academic research. While I imagine it would ultimately be rather trivial for IBM to negotiate such rights, it is a massive barrier for individual researchers.
The Guardian highlighted this very issue in an article last month: http://www.guardian.co.uk/science/2012/may/23/text-mining-research-tool-forbidden
And even more recently, a Guardian article last week reported that open access to research is inevitable, according to Nature’s editor-in-chief.
Thanks for your note, Nick. You’re right: it is sometimes easier for institutions to obtain access to the medical literature. Other times, it’s harder, depending on what the organization intends to do with the material. But to answer your question directly, the selection of specific information entered into a given data corpus is dependent on client needs, but there are three general sources of information:
• Public content (e.g. news articles, social media)
• Industry-specific information (e.g. medical journals and textbooks)
• Situation/client-specific data (e.g. data owned by the client, in this case the hospital)
Michael, thanks for your response. It’s great to have you here answering questions personally.
My question is more about the permissions involved rather than the types of information given to Watson. In cases where you’re dealing with industry-specific information, like in your work at Memorial Sloan Kettering, do you negotiate rights to mine medical journal content from each relevant publisher (e.g. Springer, Elsevier, Taylor & Francis, Wiley, etc)? If so, do those rights apply only to the subset of journals to which the institution where Watson is installed subscribes — that is, can Memorial Sloan Kettering’s Watson installation only mine medical journals to which MSK currently subscribes?
Hi Nick. I’m very glad to be here. Right now, IBM Watson is in pilot mode with WellPoint and Memorial Sloan Kettering. So broader, production-level deployments may differ. But for the current pilots, IBM has negotiated directly with content providers without involving the licenses of our pilot partners.
I have to issue a Mea Culpa. I should not have assumed that IBM was trying to solve the issues that are central to the growing ranks of expert patients. Michael Holmes is doing a fantastic job in explaining what Watson is and is not. Thank you!
Since it is now clear that in its present iteration Watson’s output is limited by what the commercial clients use as input, the issue of bias seems unresolved. Michael, do you think that Watson, in its current version, will minimize or, on the contrary, amplify the inherent biases of each client?
Thank you for your kind words, Gilles. Rather than speculating on the effects of Watson on client biases, I CAN say that IBM’s clients (and IBM Watson clients in particular) tend to be forward-thinking and open-minded with respect to the place of technology in improving the outcomes of their patients. If there were one data corpus that included data contributed by patients and another data corpus that was identical in every way except that it excluded such data, it would be possible to do side-by-side tests that demonstrate the effectiveness of adding this data. If inclusion of the data improved results, I would suspect that IBM Watson clients would be receptive to using it. The idea of building a crowd-sourced data corpus is certainly an interesting one. It sounds as though participants in the e-patients community would be interested in contributing. Do you think there would be an appetite to do so from within the public at large?
By coincidence, on the heels of the wide-ranging discussion above of Watson applied to health care, the July issue of Scientific American has an article titled: “Machines That Think for Themselves” by Yaser S Abu-Mostafa, a computer science professor at Cal Tech.
For those wanting to know more about the computer techniques behind Watson and the current buzz around “big data,” I recommend picking up the issue (or reading it online with a subscription) and carefully reading the article. It’s a brief primer on a collection of processes called “machine learning”: supervised, reinforcement, and unsupervised learning. Abu-Mostafa also does a decent job of identifying conditions where too much computing power and overly complex models can lead to significant errors and misinterpretation of the outcomes. Like humans, computer algorithms can sometimes dredge up patterns that are meaningless.
Machine learning is not appropriate in all situations. I think if you read between the lines you’ll realize that machine learning is definitely work-in-progress. At this point it’s part science and part craft. That’s why I really appreciate Michael Holmes’ candid statement above that, “IBM Watson has only just begun to take the first baby steps in a multi-decade journey.” It’s a perspective to keep in mind when framing expectations for any of the AI/human interface products soon to be upon us.
Another reason I recommend this little article is that I think it is going to be increasingly critical for members of the medical community — and the public, for that matter — to have a basic grasp of what machine learning is and what it is not. I don’t think Watson-like systems mean the end of physicians by any stretch, but medical students may wish to include this subject in their curriculum. Big data systems will most certainly be resources they’ll need to master. But how will physicians know whether or not to have confidence in the results of machine-learned diagnoses and actionable alternatives if they do not comprehend the process for arriving at output? How can they communicate confidence to patients about the information they have? How can they take professional responsibility for their treatment decisions if they got substantial input from something that is, to them, a black box?
Actually, large-scale machine-processed information sources with much improved language query will be available to the public too. IBM may be ahead now, but Google and other companies are very active in AI and language interfaces.* Someone will take the technology public in the not too distant future. In fact, the public — or the proactive part of it — has a good chance to get out ahead of physicians in knowledge about machine learning. Free online courses, open to the public worldwide, are popping up like mushrooms. Last fall I enrolled in Stanford Engineering School’s online artificial intelligence course taught by Sebastian Thrun, the brains behind Google’s self-driving cars. I wasn’t alone; there were 160,000 enrollees. Now I hear there are similar free online courses coming out of MIT and Cal Tech, and Thrun has his own startup global online university, Udacity.
While most students in these courses are looking at a career in AI programming I suspect a significant number are people from other disciplines picking up knowledge in anticipation of AI entering their field or business. I think this is very healthy, perhaps even essential for professional competence. For instance, there are regulators in the UK and the US beginning to question the use of very powerful machine learning programs for high-speed trading in financial markets. There is reason to suspect that the executives at the top of some companies participating in trillion-dollar markets don’t know squat about how their trading programs work. Maybe even their programmers — often star physics graduates hired for their high level of mathematical sophistication — don’t fully know either.
Unfortunately, in my experience, the digital technology industry is often less than forthright in its representation of products. Most people know about the suits filed recently against Apple for misrepresenting the performance of the digital assistant, Siri, in its advertising. Consumers, more experienced than in the past, evidently are getting genuinely pissed off and are calling out untruthful marketing.
More to the point of this discussion, I was really disappointed when I viewed the “Voice of the Physician” video linked in Michael Holmes’ comments above (http://www.youtube.com/watch?v=63xe8gXODEM&feature=plcp). Frankly, I went from incredulity to laughter while viewing it. In my opinion, some of the medical professionals in it need to get their feet back on the ground! There are folks quoted who seem to think Watson is going to “transform” health care in five years! Seriously? I quoted Mr Holmes at the beginning of this comment saying Watson is taking “baby steps” in a “multi-decade journey,” but that’s not the message I got from “Voice of the Physician.”
I hope members of the patient participation movement will make the effort to inform themselves about the state of AI techniques so they can evaluate the claims of service providers clearly and help the public maintain realistic expectations. A boatload of high-tech stuff will be thrown at us to snag a share of the multi-trillion-dollar health care market ahead. After decades of sometimes being held at arm’s length by physicians through their aura of expertise and authority, the last thing patients need is another layer of mysterioso in the form of digital medical oracles to inhibit their self-empowerment.
* Just today an article from WIRED (http://bit.ly/Orfytv) proclaimed a Google neural network searched through 10 million YouTube thumbnails in 3 days and learned to identify…cats. With 74.8% accuracy. A paper on this feat will be presented at the International Conference on Machine Learning starting today in Edinburgh.
Good stuff, David. Thank you. Let me add to this comment: “But how will physicians know whether or not to have confidence in the results of machine-learned diagnoses and actionable alternatives if they do not comprehend the process for arriving at output? How can they communicate confidence to patients about the information they have? How can they take professional responsibility for their treatment decisions if they got substantial input from something that is, to them, a black box?” I would say that while MDs might benefit from a basic understanding of machine learning, that’s not nearly as valuable as an understanding of the information behind a recommendation. IBM Watson is entirely transparent and not at all a black box. It will provide a panel of possible diagnoses with corresponding % confidence levels. BUT that’s not nearly enough. No MD or patient would trust a diagnosis or treatment on the basis of “because Watson said so”. At the push of a button, the documented, evidence-based sources supporting each possible diagnosis that Watson suggests (i.e. medical journal, textbook,… or expert patient group practice) are provided. At another press of a button, users can drill down further to the paragraph level within each source for the ‘smoking gun’ behind each piece of evidence that contributes to a recommendation. Then, users can get the same level of drill-down for each entry on a treatment advice panel.
Speaking for myself, I would describe widespread access to this kind of resource as transformative. But I would also call what IBM is doing now as a first step on a long journey. Right now, we are working with Memorial Sloan Kettering and WellPoint on a diagnosis and treatment advisor for lung, breast, and prostate cancer. One could mentally extrapolate to expansion into other forms of cancer and other diseases. Yes, this is a long way off from general purpose medical advice. How long will broader proliferation take? Time will tell. But regardless of the timing, I think it would be tough to argue that availability of this kind of resource is trivial.
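A rough sense of the structure Holmes describes above, a ranked panel of hypotheses, each carrying a confidence score and traceable evidence, can be sketched as plain data. The field names, conditions, and numbers below are invented for illustration and are not Watson's real output schema:

```python
# Illustrative data shape for a confidence-scored, evidence-backed response
# panel of the kind described above. All fields and values are hypothetical.

panel = [
    {
        "diagnosis": "Condition A",
        "confidence": 0.72,
        "evidence": [
            {"source": "medical journal", "passage": "supporting paragraph..."},
            {"source": "textbook",        "passage": "supporting paragraph..."},
        ],
    },
    {
        "diagnosis": "Condition B",
        "confidence": 0.21,
        "evidence": [
            {"source": "clinical guideline", "passage": "supporting paragraph..."},
        ],
    },
]

# The workflow described above: show the ranked panel first, then let the
# user drill down into the evidence behind any single hypothesis.
for hypothesis in sorted(panel, key=lambda h: h["confidence"], reverse=True):
    print(f"{hypothesis['diagnosis']}: {hypothesis['confidence']:.0%}")
    for ev in hypothesis["evidence"]:
        print(f"  - {ev['source']}")
```

The point of a shape like this is exactly the transparency argument: every ranked answer stays attached to the passages that produced it, so nothing has to be taken on "because Watson said so."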
Michael, thank you for the reply. It certainly reassures me to learn about the transparency that will be part of the Watson system and its practices. I would only hope for one more thing: as these implementations go forward I hope the spirit of openness affords the greater scientific and public communities the opportunity to see how machine learning works. The future credibility of such systems will depend on transparency not inhibited by proprietary restrictions. Certainly that would be a boon to health care. Also, there is potentially great value in the collaboration of medical researchers and physicians with computer scientists with substantial mathematical and algorithmic prowess. A lot of clarity could result although I think it’s going to be a lengthy process.
Call this: Watson: A Love Story Redux
An article on The Health Care Blog by Dr. Davis Liu has set off a heated debate (is firestorm too strong?) over the future of AI in medicine that reminds me of the discussion above about IBM Watson’s use in medical institutions like Memorial Sloan Kettering.
At the recent Health Innovation Summit in SF the keynote speaker Vinod Khosla, a VC and longtime Silicon Valley entrepreneur, evidently said, among other provocative things: “Eighty percent of doctors could be replaced by machines.”
Yikes! The barbs are flying fast and furious over at THCB! http://thehealthcareblog.com/blog/2012/08/31/vinod-khosla-technology-will-replace-80-percent-of-docs/
There are signs of anger, fear, and defensiveness among the MDs and other health professionals along with some “get-up-to-date” admonishments. It’s fascinating to watch professionals duke it out. If you read it remember the fireworks are in the comments.
I take two ideas away from this discussion.
One: no matter what your profession or occupation, rapidly advancing technology will constantly challenge you to stay up-to-date during your career. No one is immune.
Two: there are many, many forces of change hammering the medical profession and the so-called health care system. Patient participation (not even mentioned in the THCB debate) is one among many forces destabilizing the future. #S4PM is by no means alone in trying to get movement, but the complex interactions of those forces make the future path of health care quite unpredictable.
More on the ever evolving story of Watson. http://www.kurzweilai.net/paging-dr-watson-artificial-intelligence-as-a-prescription-for-health-care?utm_source=KurzweilAI+Daily+Newsletter&utm_campaign=2cbc5e4437-UA-946742-1&utm_medium=email
I think I can sneak this in under our AI topic. David Freedman’s article in The Atlantic, “Lies, Damned Lies, and Medical Science,” is pretty alarming (http://bit.ly/RKLkP6). He argues that medical science research (and perhaps science in general) is much less sound than we like to think. Read it for yourself.
But this brings me back to Watson. I keep wondering what database is going to serve as “the Truth” for Watson’s machine learning. No AI can give better results than the quality of the data it’s fed when it’s “learning.” Kind of like kids. Again I say, from the beginning of IT the phrase “garbage in, garbage out” has been a truism. If Freedman is correct, how can Watson overcome that?
If IBM wants to do a real service before it sells a bunch of Watson systems to medical institutions, it seems to me it’s going to have to help validate or clean up the medical literature, no small undertaking. Can algorithms run on a powerful computer help sort through the problems Freedman is talking about?
The validity of the data is what makes me seriously question the whole “Big Data” hype we’re hearing these days. Forty years ago when I was a graduate student studying sociology I fell in with a group of sociologists who were convinced that social data was so poor in its design and collection that it was virtually worthless. I dropped out and that insight left me a permanent skeptic about this sort of thing.
It’s not just garbage in, garbage out that worries me; it’s the unknown unknowns that aren’t in the databases at all, such as rare diagnoses that get excluded.
From a clinical practice perspective, I don’t think physicians and nurses need help with obvious diagnoses, but rather finding overlooked ones. As anyone with a rare condition can tell you, misdiagnosis is the default outcome of the current system—to add value, Watson and other AI/Big Data approaches need to address this unmet need, and I worry that they can’t.
For example, while researching our Big Data: Hype and Hope paper with DrBonnie360, I had a chance to study an SRII presentation on Watson’s provider support capabilities. The example shown was a middle-aged woman with recurring urinary tract infections. The possible diagnoses were ranked in the results, and number one was…a urinary tract infection! This seemed so obvious as to insult any user. Renal failure, diabetes, influenza, and hypokalemia were all listed as lower-probability diagnoses. But, because I was dealing with a recent diagnosis of bladder cancer in my 87-year-old mother, I knew that recurring UTIs are a hallmark of bladder cancer, at least in the elderly. So I wondered why, in the Watson example, bladder cancer appeared nowhere on the differential Dx, especially since a family history of bladder cancer was listed among the inputs. Was this a poorly presented example, with incomplete information making it to the slide, or is this a real gap in Watson’s data model?
This example is scary because “recurring UTI suggests bladder cancer” should be in the database, but I know of other conditions (e.g., deep gluteal nerve entrapment) where there is next-to-nothing in the literature, and most of it in denial of the diagnosis.
Can any AI system point out where accumulated medical wisdom is wrong?
Ellen – just to clarify – Watson does not create new knowledge. It cannot do anything that people cannot do (given enough time). So, yes, it is dependent on the medical literature in the same way that flesh-and-blood medical professionals are when making evidence-based decisions. The UTI demo you saw is obsolete; it was put together in the early days as an example of Watson’s iterative info analysis process rather than to showcase sophisticated medical specifics. It was for a general audience, and other medical shortcomings have been brought up as well. A more accurate representation of what we’re working toward (at least in cancer diagnosis and treatment use cases) can be seen here in this unlisted video of some guy (me) walking through a demo at an event in Japan. http://www.youtube.com/watch?v=oGbbbS04e2Q
Michael: that’s a very interesting video. While it omits details of how Watson actually arrives at its confidence levels and which sources of data are structured vs. unstructured – understandable, given its length and [presumed] audience – it still provides a helpful illustration of a use case. Is it “unlisted” because of the level of polish typically required for IBM media materials, or because of a desire to restrict awareness of the use case? I’d be inclined to tweet a link to it, but don’t want to do so if it might restrict your future openness in participating in the comments on this blog post (which seem to have become a forum of sorts).
Joe – I put it there mainly as a means of training other people to deliver the demo. Within the next few days we will have completed a more polished and more complete version of this demo and will post it on the IBMWatson YouTube channel. So feel free to subscribe to be alerted of updates. I left it unlisted because it’s kind of rough. We’ve given this demo many dozens of times though so we’re certainly not trying to hide it. But please hold off on linking a tweet to it. Save that for the better version to be posted shortly. thanks
I imagine most people on this thread have already seen it, but FWIW, here is a link to what I believe is the more polished version of the video Michael told us would eventually be posted:
https://www.youtube.com/watch?v=HZsPc0h_mtM
It’s an 8-minute video showcasing the IBM Watson Oncology Treatment Advisor, posted February 8.
A must-see video about AI and predictions about its arrival: http://fora.tv/2012/10/14/Stuart_Armstrong_How_Were_Predicting_AI
Watson might be trainable, but is still a toy. I would not trust a computer system that makes gross errors, like this, with any part of a diagnosis…
Sorin – dig a little deeper rather than jumping to conclusions. Watson expresses varying degrees of confidence in its response-panel hypotheses. If it has incomplete info, it has low confidence and asks a follow-up question. It would never have ‘buzzed in’ on that question with so low a confidence level, but it was forced to answer since it was a Final Jeopardy question. Here’s a more complete look at the reasons behind that situation – http://asmarterplanet.com/blog/2011/02/watson-on-jeopardy-day-two-the-confusion-over-an-airport-clue.html
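[Editorial aside: the confidence-gating behavior Michael describes can be sketched in a few lines. Everything here – the function names, the threshold value, the candidate answers – is hypothetical illustration, not Watson internals.]

```python
# Toy sketch of confidence-threshold gating: answer only when confident
# enough, otherwise ask a follow-up question -- unless forced to answer,
# as in Final Jeopardy. All names and numbers are made up for illustration.

def respond(hypotheses, threshold=0.5, forced=False):
    """hypotheses: list of (answer, confidence) pairs, confidence in [0, 1]."""
    best_answer, best_conf = max(hypotheses, key=lambda h: h[1])
    if best_conf >= threshold or forced:
        # Confident enough to buzz in, or required to answer anyway.
        return ("answer", best_answer)
    # Low confidence: gather more information instead of guessing.
    return ("follow_up", "Can you tell me more?")

# High confidence: answers outright.
print(respond([("Toronto", 0.14), ("Chicago", 0.81)]))
# Low confidence: asks a follow-up question unless forced.
print(respond([("Toronto", 0.14), ("Chicago", 0.11)]))
print(respond([("Toronto", 0.14), ("Chicago", 0.11)], forced=True))
```

The point of the sketch is that the same low-confidence hypothesis set produces a question in normal play and a best guess when an answer is mandatory.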
It’s an interesting coincidence that the “Watson” information system should come up on this thread again just as an article titled “For a Second Opinion, Consult a Computer?” was published (Dec. 3) in the NY Times. It’s a good article containing an expert diagnostician’s views on software systems, and it specifically mentions Watson. I think it has a pretty fair perspective.
What continues to bother me about the way Watson, Siri and other AI (whatever that is) products are marketed is the mindless anthropomorphization of what are simply hardware and algorithm-executing systems very cleverly implemented and lovingly fine-tuned by smart teams of people.
I’d think that hardware and software engineers would be the ones who most thoroughly understand that Watson and other computer systems neither “understand” nor are conscious of any of the words or questions that run in binary through their circuits. They are executing algorithms derived from the latest mathematical models of probability theory. They sift through carefully selected “training” masses of text and numbers to determine patterns and to calculate the most probable matches with input strings. They use carefully designed “success criteria” to further refine their calculations for the next time around.
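[Editorial aside: to make the comment’s point concrete, here is a deliberately naive example of “calculating the most probable match with input strings” – scoring candidate passages by word overlap with a query. This is a toy stand-in for the statistical models being described, nothing like Watson’s actual pipeline.]

```python
# Score each candidate passage by the fraction of query words it contains,
# then pick the highest-scoring one. No "understanding" is involved --
# just counting token overlap, which is the commenter's point.

def score(query, passage):
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p) / len(q)  # fraction of query words found in the passage

passages = [
    "Treatment options for a urinary tract infection",
    "History of the Jeopardy game show",
]
query = "what is the treatment for urinary tract infection"
best = max(passages, key=lambda p: score(query, p))
print(best)
```

A system like this produces plausible-looking matches purely from text statistics, which is exactly why words like “understanding” can mislead lay audiences.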
But engineers frequently use language that equates these calculation processes with words like “understanding,” “learning,” “interpretation” and other terms that, to my mind, simply confuse the matter rather than help ordinary people really understand what is going on. Giving computing systems human names and using (I think often misusing) familiar laymen’s terms may help market IT products, but it does a disservice to users who urgently need accurate comprehension of what these systems do, and don’t do.
Computers are powerful tools that don’t need names. As with other tools, in the hands of an experienced user with deep understanding of the tool and insight into how it is best used, computing systems can produce wonderful results. But the best tools in the hands of naive, untutored users can produce terribly botched results. In the life-and-death matter of medicine, patients (the ultimate “end users”) deserve everyone’s best efforts.
It’s no coincidence. I spotted the NYT story and tweeted a link to this post as another place for people to discuss the issues. The comments are now closed on the story (at 231!).
For those who haven’t read it yet:
http://www.nytimes.com/2012/12/04/health/quest-to-eliminate-diagnostic-lapses.html