ASGE International Sampler (On-Demand) | 2024
Clinical Evaluation of ML Algorithms in Gastroenterology
Video Transcription
So we'll move on to discussing the clinical evaluation of machine learning algorithms in gastroenterology, and we will address this both from a research perspective and from a practical clinical perspective: how we, as gastroenterologists, decide whether or not to adopt these new tools in our own practices. We'll explore three areas. The first is the current state of evidence for computer-aided polyp detection, which is really where the bulk of our evidence is. The second is the question of what evidence we will need to adopt computer-aided polyp diagnosis. The third is a few practical considerations for evaluating new technologies in your own endoscopy suite.
I think we first need to acknowledge that we're actually in a wonderful position in gastroenterology, in that we lead all other medical fields in terms of high-quality randomized clinical trials evaluating AI tools. Hopefully we'll remain in this lead, but right now the field of gastroenterology really has a dominant position because of a lot of the work that has been done on polyp detection and polyp diagnosis specifically. Polyp detection really is, in fact, a gold standard for randomized controlled trial evidence across AI in all medical subspecialties. There are now nearly 10 randomized trials, all of which have shown some benefit for adenoma detection or other similar outcome measures.
Across the available randomized trials, we're seeing improvements in adenoma detection, although a couple of key points need to be made. Most of the benefit seems to be for fairly small or diminutive adenomas. Some of this is just a statistical pattern, insofar as those are the majority of adenomas that we identify and remove; in most colonoscopies, we may find a few small adenomas. You need larger trials to show benefit for much larger polyps, but we are paying attention in particular to data that will support sessile polyp detection in the right colon.
We all know that that's a really critical need, and thus far the evidence for that is fairly thin, although beginning to grow. In particular, there are two recent U.S. and European multicenter tandem colonoscopy randomized controlled trials. This meant that patients underwent two colonoscopies, one with and one without AI, in random order. In both of these trials, the adenoma miss rate was reduced substantially, but again, the benefit was concentrated around very small polyps. One of the studies, the primarily U.S. study, showed a decreased miss rate for sessile serrated lesions as well. This was the first study to show that trend, and hopefully we'll see that trend picked up in future studies as many of these AI software systems are trained on a higher number of sessile serrated lesions over time.
We have to ask in our own practices, what will be the downstream effects of implementing AI for screening colonoscopy? Across all of the available trials with pooled data, it was clear that this resulted in a higher likelihood of intensive colonoscopy surveillance, based on both U.S. and European standards. Just to translate that very directly, it means that if you adopt computer-aided detection in your practice, it's likely that you're going to be recommending shorter colonoscopy intervals. That may have economic consequences for your business evaluation of a technology like this.
We also have to acknowledge that while we refer to adenoma detection rate as our gold standard quality measure, this is really a surrogate for something we care about much more, which is avoiding interval colon cancer mortality. We're trying to protect our patients by removing polyps, but we're doing that to try to prevent colon cancer five, 10, 15 years down the road. There's a gigantic study being arranged out of Norway, in fact, by part of the same team that recently published the somewhat controversial Nordic study in the New England Journal.
They're looking at over 200,000 patients undergoing colonoscopy with and without AI to determine if there's a measurable reduction in cancer mortality after 10 years. You can look out for this trial; almost a decade from now, we'll have really wonderful quality data. Certainly, a lot may change over the course of the next decade, but we applaud the investigators for taking on such a Herculean task.
We also have to recognize that other types of studies are going to come out looking at computer-aided polyp detection, because now that many of these systems have reached regulatory approval in a variety of countries, we're in a phase of implementation research. What that really means is we're trying to understand how these technologies will actually work in the real world, outside of the artificial investigational or trial environment. Already a number of studies are being published, which are sometimes observational, sometimes retrospective. One's ability to control variables in these studies is diminished compared to a randomized controlled trial, but the studies are still informative and important to pay attention to.
One of the studies that got quite a lot of attention fairly recently was out of Israel. This was a retrospective observational study that looked at 4,000 patients who underwent screening colonoscopy before and after AI adoption. The results were quite negative. In fact, beyond negative: it suggested that AI diminished endoscopist performance, perhaps through behavioral differences such as faster withdrawal. It's hard to know why that could occur, and hard to create a sensible explanation for it, but a subsequent analysis and letter to the editor raised the possibility of unbalanced disease prevalence in this interesting post-COVID period, where there are really different groups of patients who are coming in at the tail end of COVID and then waiting for another year.
It might be that these are really different groups of patients, perhaps with different polyp prevalence, but this is the strength and weakness of observational research. We do need to understand how these technologies work in a real-world environment, but the research will be prone to certain biases and errors that may be hard to understand, and it will never be quite as clean as a randomized trial.
Now, a lot of our focus has been on computer-aided detection, but that's just the very beginning of our pathway for machine learning in GI endoscopy. In the future, we'll certainly be using computer-aided diagnosis for polyps to provide a definitive pathologic diagnosis of what the polyp is, or perhaps to diagnose Barrett's dysplasia. And we'll also be moving towards automated reporting. The most exciting thing in the more distant future is trying to identify novel insights from endoscopic information. For instance, there's a study looking at disease prediction, whether an ulcerative colitis patient is likely to soon have a flare or not, with an opportunity to change the medication before the flare even occurs. These are some of the really exciting promises of machine learning in endoscopy.
Looking at computer-aided polyp diagnosis specifically, we now have multiple published papers that demonstrate that computer-aided diagnosis for colon polyps can exceed novice performance in differentiating hyperplastic polyps from adenomas, can match expert performance for optical diagnosis, and can also meet PIVI criteria to allow for resect and discard. So I think there's a real possibility that having computer-aided diagnosis may increase the confidence of endoscopists to consider removing the polyp and discarding it without sending it to pathology, at least for small polyps. And in fact, there's good evidence that this approach works and is reliable and is safe.
And so the question is whether computer-aided diagnosis that supports a resect and discard strategy could be a really important unlock for encouraging AI adoption, because it could create a pathway for cost savings. The concept here is that in the U.S. healthcare system, there are a variety of bundled care models across several different fields. One of these could potentially incentivize gastroenterologists to leverage computer-aided diagnosis, which would lower the cost of pathology and polypectomy. I think that encouraging a diagnose-and-leave or diagnose-and-discard strategy for small polyps likely needs to be supported by innovation on the payer side to allow practices to consider adopting this type of model, because otherwise, reducing pathology and reducing polypectomy may cut into revenue streams that are important for a variety of practices.
Now, our main focus has been on computer vision. We have to acknowledge that, in fact, the world of machine learning in gastroenterology is much larger than just computer vision. In particular, there will be algorithms that start occupying our electronic health records that may provide guidance in inflammatory bowel disease care, such as whether or not somebody needs a biologic or predicting the need for surgery, or algorithms that help with GI bleed management or triage. So this is going to be a more subtle way that machine learning is introduced into our lives, when there are recommendations from a machine learning system as to what to do. The critical thing is that we still have to use our judgment and expertise as physicians in deciding whether or not to take a recommendation. The nuance of how physicians incorporate machine learning recommendations in their practice is going to be a critical part of understanding machine learning going forward and a critical concept for our field.
The idea, basically, is that in the media, when we talk about physicians and AI, newspaper and magazine articles often set the concept up as physician versus AI. And that really, I think, is not the interesting question. The much more interesting question is to understand how physicians work with AI and how that can alter and potentially improve our practice. A really nice example of this is a study in which endoscopists got to use a computer-aided diagnosis system but were still free to make their final decision based on their own insight and the output of the AI system. The result was that endoscopists could achieve a higher level of performance when taking the AI into account, so a human-AI hybrid decision outperformed the human alone or the AI alone. And that, again, is, I think, very reassuring for physicians. It means that our insight into using tools, and into when to ignore a recommendation or when to adopt it, is still deeply important and very powerful.
As we finish off, I think while AI tools may sound intimidating, the truth is we're accustomed as gastroenterologists to thinking critically about adopting new technologies in our practice. And I think the rubrics and considerations for adopting AI tools are actually very similar to those for adopting any other tool. That means we have to consider the technical aspects of the tool: how it compares to the gold standard, how it works, whether it is interoperable, whether it works with all of our scopes. We have to consider the clinical impact: what are the relevant clinical outcomes for this condition, and how does this device affect them for the population we take care of? And then usability and cost are also critical considerations. So the reality is we're going to use the same type of mechanisms to evaluate whether or not to adopt an AI tool as we would to consider any other endoscopic device.
And as we finish, I think that all of us need to think critically about where we want our practice to be located on the innovation adoption curve, and where we are comfortable being as physicians. At the moment for AI, I think we're really in the early adopter phase. There are many practices across the country that are beginning to use AI polyp detection, but it's not the majority. And often there is a chasm. There's a wait of a year, two years, three years, and it's a long wait to see broader adoption. That broader adoption often involves updating practice guidelines. It often also includes cost considerations: as more technologies enter the market, sometimes the costs come down enough to consider adoption. Nonetheless, we have to make decisions for ourselves and for our patients about where we want to be on this adoption curve. Again, we make similar decisions for any piece of AI technology. Gastroenterology practices are expert in making these types of decisions. And I think we need to use our expertise in technology adoption to make the same types of assessments for AI and machine learning tools. Thanks very much.
Video Summary
The video transcript discusses the clinical evaluation of machine learning algorithms in gastroenterology, focusing on computer-aided polyp detection and diagnosis. The speaker highlights the need for evidence to support the adoption of these tools and discusses potential benefits and considerations. The field of gastroenterology leads in evaluating AI tools, particularly in polyp detection. Studies show improvements in adenoma detection, especially for small adenomas. Challenges remain in detecting larger polyps, particularly sessile polyps in the right colon. The transcript also touches on future applications of AI in endoscopy and the importance of physicians working with AI tools to enhance patient care.
Asset Subtitle
Tyler Berzin MD, MS, FASGE
Keywords
machine learning
gastroenterology
polyp detection
AI tools
endoscopy