Clinicians Trust in AI, Fairness and Bias-Why is it Important?
Video Transcription
Our last speaker of the morning session: it's my pleasure to ask Dr. Sravanthi Parasa to talk to us about clinicians' trust in AI, fairness and bias, and why it is important. Dr. Parasa is a practicing gastroenterologist at the Swedish Medical Center. She has specialized training in epidemiology and biostatistics. She's a key member of the ASGE Task Force for AI implementation, and she's really a force of nature when it comes to this AI topic. She's very passionate about it, and whenever a question about AI comes to my mind, she's the first person I go to. Dr. Parasa.

Thank you for that kind introduction. Can everybody in the back hear me? Yes. Okay. All right. You can be quiet for this session, okay? So we need some audience participation. My topic is clinicians' trust in AI, bias and fairness, and why it is important. I actually got that image just this morning. So far we've been talking about all the fun things about AI and what we can do with it. Now I'm going to talk about the dirty secrets of AI and why we need to know how AI works so that we can adapt to it as clinicians. These are my disclosures.

And this is where I want you to participate. What do you see on the slide up here? Anything is fine. Nothing? It's a watermelon, right? Can you give me one more descriptor of this image? It's a sliced watermelon. One more. Okay. All right. That's probably what most people guess: watermelon, juicy watermelon, watermelon seeds, watermelon juice or slices and so on. Now what do you see? Did we ever recognize that there is a yellow watermelon? We never said the first one was a red watermelon, but we say this one is a yellow watermelon. That's the point I want to bring out in my talk: when we are talking about bias and fairness and so forth, we need to recognize different populations, their needs, and how we go about addressing them. So anyway, this is a yellow watermelon, it's a juicy watermelon, it's a watermelon with seeds, it's watermelon slices. The exact same thing, just in a different color.

I'm not going to go over all of the really big issues in this 15-minute talk, and several of the speakers before me have already talked about the bias that exists in AI and how AI works. AI is not intended to create bias, but it perpetuates the bias that we introduce into these data sets. As clinicians, when we are deploying these algorithms on our patients, we need to understand how they work, because that's what we do as clinicians, right? When we read a scientific or statistical paper, we want to know if this particular drug works for the patient sitting in front of me. Were patients from this particular category included in the clinical study? The same thing applies to AI-based clinical studies. So my talk today will focus on some of that education, so that it opens our eyes to what we need to look for when we are doing or reading a clinical AI paper.

This is one of the landmark papers, by Ziad Obermeyer. It describes a very widely used algorithm that hospital systems used to predict which patients might need extra care in the coming months. The way they developed this algorithm was essentially by looking at cost: which patients had higher hospital charges over the following period (about two years in the study).
And then they thought: okay, once we identify these high-cost patients based on retrospective data, can we provide them extra care, reminders, nurse navigators, so that we can give them better care? The problem was that they based the whole model on the hospital charges, the cost incurred so far by that specific population. It so happened that most of the patients who got this extra care and extra attention were white patients, because more was being charged for their care. Just because patients have higher costs doesn't mean they have more severe disease. And this was a very widely used AI algorithm. There are several other examples if you look through history, including courts and judges with the COMPAS algorithm. The same thing shows up with different skin colors and pulse oximetry, risk prediction, hiring; it exists in every field of AI.

Now, when we as clinicians are looking at AI algorithms, what should we look for? One is relevance of the use case. Is this use case relevant to my population? Yes, we are talking a lot about polyp detection, but is that problem relevant in Africa? Probably not. They won't relate to any of this. So when you have an algorithm built for a specific use case, you need to know whether it's relevant for the population you're treating. The second thing is understanding the algorithm. I know we are not computer scientists who understand every nuance of the algorithm, but we need some basic understanding of how these algorithms work and which metrics to look for, whether those are performance metrics for the algorithm itself or the clinical measures we are trying to capture. And then, of course, we just went through the watermelon example, so now we understand a little about bias and why data transparency becomes important. And then I'll touch a little on trustworthy AI and a little on regulatory issues.

So when we are talking about bias, again, think of it like a clinical problem. Let's say you have a paper in front of you saying the area under the curve is 0.909 and the sensitivity is 99%. Yes, we are energized, we think this algorithm works, but we need to look at the finer details. Is this algorithm actually for tubular adenomas or sessile serrated adenomas, and what is the size of the polyp? Whatever your question is, is that data relevant to the question you want to solve? The second thing is: okay, you're telling me this algorithm seems like it will work on tubular adenomas and SSAs, but is the data representative of the claim being made? Of course, there could also be bias in the training data itself. And then, when the data scientists are building these algorithms, they pick features from our data for optimization and so forth. That is what we call algorithmic bias. It has nothing to do with the data; it's how you process the data and how you build the algorithms. And finally there are the predictions that this whole pipeline produces. So these are the different stages where bias can happen.
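To make the representativeness point concrete: an impressive overall sensitivity can hide much weaker performance on an under-represented lesion class. The following is a minimal Python sketch, not from the talk; all counts and detection outcomes are hypothetical, chosen only to show why per-class reporting matters.

```python
# Minimal sketch: overall sensitivity vs. per-class sensitivity.
# All lesion labels and detection outcomes below are made up for illustration.
truth = ["TA"] * 900 + ["SSA"] * 50 + ["HP"] * 20
detected = [1] * 890 + [0] * 10      # TAs: 890/900 detected
detected += [1] * 30 + [0] * 20      # SSAs: only 30/50 detected
detected += [1] * 18 + [0] * 2       # HPs: 18/20 detected

overall = sum(detected) / len(detected)
print(f"Overall sensitivity: {overall:.1%}")      # ~96.7%, looks impressive

# Stratified by lesion class, the picture changes.
for lesion in ("TA", "SSA", "HP"):
    hits = [d for t, d in zip(truth, detected) if t == lesion]
    print(f"{lesion}: sensitivity {sum(hits) / len(hits):.1%} (n={len(hits)})")
```

With these made-up numbers the headline sensitivity is about 97%, while SSA sensitivity is only 60%, which is exactly the gap a claim-versus-data check is meant to surface.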
So we'll talk a little bit about the relevance of the problem; we're starting with the problem and why it matters. (I don't know what's happening with my slides, but the screen has gone dark.) We'll talk about data, and we'll talk about model metrics.

When I'm talking about model metrics, I basically mean your Dice similarity coefficient, your Matthews correlation coefficient (MCC), your sensitivity, or the confusion matrix, right? Those are the parameters that the data scientists and statisticians are looking at. And then we need to understand: today I have one algorithm; tomorrow I will probably onboard five algorithms on the colonoscopy that I'm doing. Who is monitoring the efficacy of these algorithms for this particular set of patients? My patients might be completely different from patients elsewhere in the country, right? That's where model governance comes into play.

To understand some of this, and again, I'm not going to give a whole machine learning talk, there are certain basics we need to grasp about where the training data is coming from. We absolutely need to make sure it is representative of our data, and that the algorithm validation is happening within parameters similar to what our patients represent. Of course, when we think about algorithm design specifications, we need to understand whether there's a class imbalance. Let's say I have a thousand images of polyps: 900 of them are tubular adenomas, 50 of them are SSAs, and 20 of them are hyperplastic. With that class imbalance, can we really say this algorithm will work for SSAs? So when we look into the data, that's when we realize these imbalances exist, and there are statistical and machine learning principles for how you can balance that information. Then there's variability of the data: right now maybe 50% of my patients have SSAs, but tomorrow I'm moving to a different practice where only 20% do. So the variability of the data plays a role in the robustness of the AI algorithm's performance. How was the algorithm trained? What was the performance? How was it validated? What were the clinical measures used as the endpoints for that validation?

Understanding the algorithm is also very important, right? In the first picture, there's a perfect throw: this is my algorithm, that's my outcome, and it performs perfectly. Now, as I said, I moved from the West Coast to somewhere in the South, and my whole data has changed. That's what we call data drift; if the model is not robust enough, you will have a bad outcome. The second thing, what is called concept drift, is that maybe this algorithm was trained for just tubular adenomas, but now I want to use it for SSAs and maybe some other weird, complex things the algorithm was never trained on. That's concept drift.
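One lightweight way to watch for the kind of data drift described here is to periodically compare the case mix a deployed model sees against the case mix it was trained on. Below is a minimal sketch, my own illustration rather than a tool mentioned in the talk, using a population stability index (PSI) over hypothetical lesion-type proportions.

```python
import math

def psi(expected, observed, eps=1e-6):
    """Population stability index between two dicts of class proportions."""
    total = 0.0
    for k in expected:
        e = max(expected[k], eps)
        o = max(observed.get(k, 0.0), eps)
        total += (o - e) * math.log(o / e)
    return total

# Hypothetical case mix at the original practice vs. a new practice.
training_mix = {"TA": 0.50, "SSA": 0.45, "HP": 0.05}
new_site_mix = {"TA": 0.70, "SSA": 0.20, "HP": 0.10}

print(f"PSI = {psi(training_mix, new_site_mix):.3f}")
# Common rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift,
# > 0.25 a shift large enough to re-check performance before trusting the model.
```

With these made-up proportions the PSI comes out around 0.3, the kind of shift a model governance process would want flagged and investigated.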
Now, this is one of the very popular papers on bias, where we're not just talking about bias within the data, algorithmic bias, outcome bias and so forth, but also bias at a societal level. We all know those biases exist. So, choosing the problem: maybe the problem, as we said, is not relevant to a certain population, so we should make sure we are working on all of those problems from a justice standpoint. We should be collecting data from all possible real-world settings, because you might want to build the model with the best-quality experts, but in reality the real world is really messy, so we want to keep the data as close to reality as possible. The definitions of the outcomes we claim it will work for also need to be considered. And post-market analysis, to see how effective these algorithms actually are for your patients in real life, is always very important.

So this is one way we can deal with it. Let's say you want to build an algorithm: you're all charged up, you have the data, you're recording all these videos, and you're probably getting some EHR data as well. But before you hand that data to your data scientist or computer scientist, you want to audit it to see whether it addresses the problem you want to solve, meaning: I want to be able to say this algorithm works well in a population with a 20% background prevalence of XYZ, or whatever question you are answering. That would be a good audit to run when you start building these models.

Now, one of the things I want to talk about is trustworthy AI: how do we trust these algorithms? One element is explainability. In this GIF, you see a frame with a polyp, then the segmented lesion (that little white shape on black), and then the black box. Essentially, at the black-box stage the model says this lesion has a 95% likelihood of being a tubular adenoma. But what exactly is it doing in the back end? There are different opinions about the role of explainability in medicine and AI, and I'm not saying it's the holy grail, but having an understanding of how these algorithms work is important for clinicians, because we have to explain it to our patients. What this explainability model is essentially saying is: this is a polyp, and this region is abnormal because it looks like these other images in the data set I was trained on. That helps me see that it makes sense, that it's not making weird assumptions, like pulling an image of the cecum and calling it a polyp. So that's one way you can build trust in an algorithm; just something to keep in mind.

The next one is transparency. Whenever you read a paper, or whenever you want to publish one, you want to know what the data is made of. Let's say this algorithm was trained on so many SSAs, so many TAs, from so many centers, and so on, so you know what polyps, images, or videos went into it. The second piece of the puzzle is what kind of patients: what was the demographic breakdown? I was just talking to Scott (he's not here), and apparently even from a polyp image you can infer the race of the patient, which I personally would never know unless I saw the patient. So there could be some reverse engineering that gets you back to these patients, but the bottom line is that we need to know some of these demographics.

Now, let's say we have all these algorithms, we know what's going on, we are supercharged: I know this algorithm works for my patients, I have the right data, the right product. Now you have five products stacking up, and as a clinician or a director of endoscopy, you may want to go and see: is this really making sense ten months later? That's where your model governance comes into play. This is just an example of how a model pulled a particular image; it's an explainability platform where you as a clinician can click on the image and try to understand what the prediction was based on. Was it a similarity score against other images, or something else entirely? It's one snapshot, and these are the kinds of products that will come onto the market in the near future as more and more models appear.
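The example-based explanation described above ("this region is abnormal because it looks like these other images I was trained on") can be sketched as a nearest-neighbor lookup in an embedding space. The code below is a simplified, hypothetical illustration using random vectors; a real product would use embeddings from the detection model itself and show the retrieved training images to the clinician.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical embeddings for labeled training images and one new video frame.
train_embeddings = rng.normal(size=(500, 128))
train_labels = rng.choice(["tubular adenoma", "SSA", "hyperplastic"], size=500)
query_embedding = rng.normal(size=128)

def normalize(x):
    # Scale vectors to unit length so the dot product is cosine similarity.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

sims = normalize(train_embeddings) @ normalize(query_embedding)

# Surface the most similar training examples as the "because it looks like
# these" evidence shown alongside the prediction.
top = np.argsort(sims)[::-1][:5]
for idx in top:
    print(f"train image {idx:3d}  label={train_labels[idx]:16s}  similarity={sims[idx]:.3f}")
```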
We talked a lot about data security. It's very important, so obviously work with your IT team and your security team to understand it. A lot of the time it might not be the clinician's responsibility to handle these privacy issues, but it's always important to stay in the loop, because those are the questions your patients will ask when you talk about AI. A little bit about the FDA: we talked about different regulatory bodies, but I'll leave the talk at this point with these efforts, which are not limited to GI or even medicine, to bring ethics, bias, and fairness considerations into AI, and with the national and international organizations working in this space to give us guidelines and frameworks for how we do these things. All right. Thank you so much.
Video Summary
Dr. Sravanthi Parasa discusses the importance of clinicians' trust in AI and addresses the issues of bias and fairness in AI algorithms. She emphasizes the need for clinicians to understand how AI works and adapt to it in order to provide better care to patients. Dr. Parasa highlights the existence of bias in AI algorithms, stating that they perpetuate the biases present in the data sets they are trained on. She gives examples of healthcare algorithms that have exhibited bias, such as a widely used care-management algorithm that directed extra care disproportionately toward white patients because it used prior healthcare costs as a proxy for medical need. Dr. Parasa outlines key considerations for clinicians when evaluating AI algorithms, including relevance of the use case to their patient population, understanding of the algorithm and its metrics, data transparency, and the need for trustworthy AI. She also touches on model governance, explainability, transparency, and data security. Dr. Parasa concludes by mentioning the efforts of various organizations in developing guidelines and frameworks to address ethical and fairness issues in AI.
Asset Subtitle
Sravanthi Parasa, MD
Keywords
trust in AI
bias in AI algorithms
clinicians and AI
algorithm bias in healthcare
ethical guidelines for AI