
Health Care AI: A Failure of Ambition

By Alex Jennings

Few fields are as closely aligned with technological development as medicine. It’s fair to say that medicine as a practice has been transformed by technology and now relies on it completely across all its facets, from drug development and medical diagnosis to augmentation with prosthetic limbs. Medicine has also been the source of new technologies, such as MRI scanners, where doctors collaborate with scientists to create previously unimaginable devices.

Medicine feels like it’s supposed to be futuristic: Science fiction bombards us with a gleaming white future of technology-driven medicine where we will never need to feel the cold hands of a doctor on our abdomen, and probably even the dentists have laid down their drills. So it seems perfectly natural that mankind’s latest and greatest technology, artificial intelligence (AI), should be embedded in health care. 

How hard can it be? Those of us who tried to interact with a GP service during lockdown could be forgiven for thinking the only tech needed to get most of the way would be a recording of a busy phone line alternated with a slightly frayed receptionist offering vague promises about appointments being available in a couple of months. (I’m teasing GPs in this blog post a little, which I figured is safe as I’m unlikely to meet one in person.) So, across modern health care, surely there’s huge scope for AI to help? People agree, and some of the world’s brightest minds, coupled with some of the world’s deepest pockets, have set about making this come true.

There have been successes. For example, machine learning techniques have usefully assisted medical imaging, medical record processing can be improved, and AI can even point the way to a new understanding of health – for example, it can accurately predict whether a patient is going to die, though we do not know how. However, it has not been plain sailing. When asked to compete directly against humans in novel situations, AI has failed: during COVID, AI models did not help with diagnosis or analysis despite much investment, and the transformation of front-line medical care with AI has suffered some serious setbacks.

Ambitions Thwarted

The specific problems medicine poses can be charted by investigating one of AI’s greatest successes, and the source of much of our angst about its potential superiority: games.

IBM’s Deep Blue beat the world’s best chess player, Garry Kasparov, in a single game in 1996, and then in a full match in 1997 – the culmination of about 20 years of effort in developing chess AI. IBM then developed the DeepQA architecture for natural language processing which, branded as Watson, crushed the best human champions at Jeopardy in 2011 – an advance thought to be the one that would let it compete, and win, in technical human fields.

By 2012, IBM had turned Watson, by then a combination of technologies it had developed, toward the health care industry, especially oncology.

Success looked inevitable: Press releases were positive, reviews showing progress against human doctors were published, and Watson could consume in a day the volume of medical papers that would take a human doctor 38 years to read. I made a bet with a doctor friend that by 2020 the world’s best oncologist would be a machine.

I lost my bet, but not as comprehensively as IBM lost its big bet on health care. The initial pilot hospitals canceled their trials, and Watson was shown to recommend unsafe cancer treatments. The program was essentially shuttered, with Watson pivoted to become the brand for IBM’s commercial analytics and its natural language processing repurposed as an intelligent assistant. Today, IBM’s share price is 22% lower than it was at the point of the Jeopardy triumph.

I’ve used IBM’s Watson to illustrate the difficulties here, but I could have picked failures with virtual GP services, diagnostics, or others. I’m sure efforts like these will succeed in the long run, but we can explore why some of these failures were likely.

To understand something of the scale of the challenge, we can look all the way back to where the field started: with the cyberneticists of the 1940s.

One cyberneticist, W. Ross Ashby, conceived several laws, one being his Law of Requisite Variety. This law should be better known, as it explains the root of all sorts of intractable problems in IT, from why large public sector IT projects tend not to go well, to why IT methodologies such as PRINCE2 mostly don’t work, to why we should be very worried about our ability to control super-intelligent AI. The law states that “only variety can control variety.” That is, if you have a system and you are trying to control it with another system, the control system must have at least as much variety as the system it is controlling; otherwise, it won’t be able to cope with all of its outputs, and there will be an escape.
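
To make the law concrete, here is a minimal sketch in Python (the numbers and names are invented for illustration, not anything Ashby wrote): a controller that can only produce a fixed number of distinct responses, facing a system that can produce more distinct disturbances than that, inevitably lets some of them escape.

```python
import random

# Toy illustration of the Law of Requisite Variety (invented setup).
# Each disturbance from the target system needs its own matching response;
# a controller with fewer responses than there are disturbances must let
# some outcomes "escape" its control.

def escape_rate(num_disturbances, num_responses, trials=10_000):
    covered = set(range(num_responses))  # disturbances the controller can match
    escapes = sum(
        1 for _ in range(trials)
        if random.randrange(num_disturbances) not in covered
    )
    return escapes / trials

# A chess-like system: limited variety, the controller covers all of it.
print(escape_rate(num_disturbances=10, num_responses=10))     # ~0.0

# A front-line-medicine-like system: far more variety than the controller has.
print(escape_rate(num_disturbances=1_000, num_responses=10))  # ~0.99
```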

In a game like chess, all the information needed to calculate the optimum outcome is contained on the board – chess is hard, but its variety is not great. In the world of front-line doctoring, however, there is incredible variety, and you need incredible complexity to supply the right outputs. This presents an immense challenge for AI: real-world patients will include edge cases relative to its training material, yet the AI must handle them effectively in one shot. We find it cannot, and escape is inevitable, such as the medical AI that agreed a patient should kill herself, one that was solving problems but was maybe racist, or one that was definitely racist. Could a future medic’s workday involve running the surgery, doing the admin, and checking whether the AI assistant has had a racist incident?
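
Here is a minimal sketch of what “one shot on an edge case” means in practice (the features, values, and labels below are invented purely for illustration, and have nothing to do with any real clinical model): a model that can only interpolate between the cases it has seen still gives a confident answer for a patient unlike anything in its training material, with no mechanism to flag that it is out of its depth.

```python
from math import dist

# Toy edge-case sketch (invented data). A nearest-neighbour "diagnoser" is fit
# on typical cases only, then must answer an atypical case in one shot.

typical_cases = [
    ((37.0, 70), "healthy"),     # (temperature C, heart rate) -> label
    ((37.2, 75), "healthy"),
    ((39.5, 110), "infection"),
    ((39.8, 115), "infection"),
]

def predict(features):
    # All the model can do is pick the label of the closest case it has seen.
    return min(typical_cases, key=lambda case: dist(case[0], features))[1]

# An edge case far outside the training variety: low temperature, racing heart.
print(predict((33.0, 140)))  # confidently answers "infection"; no way to say "I don't know"
```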

There is another problem in adopting AI into health care that probably has a technical name, but I will term it the “bus stop granny carnage problem.” If someone crashed their car into a bus stop and killed three beloved grannies, it would be a big story on the local news. If an autonomous car did the same, it would be a global news story, probably resulting in lawsuits and legislation. The point is that we are currently much more tolerant of human fallibility than we are of machine fallibility, and the bar for automated technology outcomes is therefore higher than it is for humans. This is somewhat rational: a single human can only do so much harm, but AI will scale, and so its mistakes would be replicated.

Ultimately, these barriers make it extremely challenging to introduce AI into front-line care to replace humans. But that doesn’t necessarily matter, as health care AI can still provide huge transformational benefits.