Computer Vision: Basic concepts
Video Transcription
Let's move on to the next session, which is about computer vision in endoscopy. And Yuichi, our apologies, I think we may have already shared some of the CADe and CADx data, but that is still one of the first and most exciting applications we are seeing in practice. So over to you, Yuichi, to talk about computer vision and its basic concepts. Yuichi, you're welcome, and we'll go over to your talk.

Good morning, everybody. It's my great pleasure to give a talk on computer vision from the endoscopist's perspective today. Thank you very much for the invitation, Prateik. I'm very excited. These are my conflicts of interest. First, I'd like to introduce myself. I'm a Japanese gastroenterologist who has been working on AI research in endoscopy for the past seven or eight years. I have a basic understanding of AI, but I'm not an expert in this field, so I maintain a close network with AI engineers, whether they work in academia or in industry. As gastroenterologists, we are also expected to play an important role in multidisciplinary AI projects, so an appropriate understanding of leadership is very important for us.

Let's get back to the title, which is computer vision. Here I'm picking up a nice definition of vision proposed by Professor David Marr, the British neuroscientist. According to his definition, vision is essentially a computational task. If we follow his definition, I would say machine learning, or AI, has a really important role, because it can optimize computer vision with useful data so that it mimics the way humans see.

Here I'm presenting examples of computer vision in the endoscopy field. They include edge detection, optical flow, image stitching, and other image processing techniques. We also have computer-based 3D reconstruction models. In addition, local features are attracting increasing attention, because such features are now used for object detection and recognition, for example by deep learning methods. So I think AI and deep learning play a really important role in computer vision, whether we rely on image processing or on local features.

We should also define the clinical roles of AI in endoscopy. Here I'm picking up three roles. The first is computer-aided detection, or CADe, which helps us detect polyps or other lesions. The second is computer-aided diagnosis, or CADx, which provides a prediction of the pathology of the polyps or diseases. And now we have the emerging concept of computer-aided quality assurance, or CADQ. CADQ provides, for example, withdrawal speed recognition or identification of blind spots during withdrawal, all of which are quality indicators for the endoscopy withdrawal phase.

I think a basic understanding of AI is very important, and to the best of my knowledge it is quite simple. There are roughly two components: one is the AI algorithm, and the other is the data. These two components work really closely together, and I would say they correspond to the brain and to experience in human beings. This is the basic understanding of AI.
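As a concrete illustration of one of the classical computer vision building blocks mentioned above, here is a minimal sketch of edge detection on a single endoscopy frame. It assumes OpenCV is available; the file names are placeholders for illustration only, not part of the speaker's material.

```python
# Minimal sketch: Canny edge detection on a single endoscopy frame using OpenCV.
# "endoscopy_frame.png" is a hypothetical file name used only for illustration.
import cv2

frame = cv2.imread("endoscopy_frame.png")                 # load a frame from disk (BGR)
assert frame is not None, "could not read the image"

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)            # edge detection works on grayscale
blurred = cv2.GaussianBlur(gray, (5, 5), sigmaX=1.4)      # smooth to suppress sensor noise
edges = cv2.Canny(blurred, threshold1=50, threshold2=150) # hysteresis thresholds are tunable

cv2.imwrite("endoscopy_edges.png", edges)                 # save the binary edge map
```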
We should also understand where deep learning sits. Deep learning is just a part of machine learning, and machine learning is a part of artificial intelligence; these are models designed to provide high performance for specific tasks. Unfortunately, we do not yet have artificial general intelligence, or strong AI, which would perform well regardless of the kind of task. But I think in the coming years we may get some form of artificial general intelligence, perhaps with the help of reinforcement learning or similar approaches.

When it comes to deep learning, we should identify the difference between conventional machine learning methods and deep learning. With conventional machine learning methods, image features have to be selected, or hand-crafted, by the investigators, so the investigators need expertise in both the clinical and the engineering fields. With deep learning, however, the best features are learned automatically by the network from the data, so we do not need clinical expertise to pick out the relevant features from the images or endoscopic videos. This is the biggest difference between the two approaches.

When it comes to performance, deep learning clearly outperforms conventional machine learning methods; this is a very famous result from the ImageNet Challenge. However, we should keep in mind that deep learning does not always outperform conventional methods. I like this figure: if you do not have many images for machine learning, the conventional, classic methods sometimes outperform deep learning. But once you have a large number of images, deep learning is the king in terms of performance. So the number of images matters when building an endoscopy machine learning model.

So how do we get this kind of huge dataset? There are roughly three measures. Number one is the most important, and it is perhaps the only way to make a model work excellently: just do it. Collect the data within your institution, with the kindness of your patients, with your network, and with your communication skills when working with companies and other partners. This is the way to go. Once you have some data, there are two remaining measures to increase the effective volume of images: transfer learning and data augmentation.

Once we have the data, the next step is to separate it into three categories: the training data, the validation data, and the testing data. The training data is used to train the model, the validation data is used for fine-tuning the hyperparameters to optimize the algorithm, and the testing data is used at the end to evaluate the model's performance. How do we adjust the model? Adjustment is done in the interplay between the training and validation sets. During this process, the hyperparameters are adjusted, including model hyperparameters, such as the number of layers, and training hyperparameters, such as the learning rate. The final evaluation is done only after this validation process is complete.
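To make the transfer learning, data augmentation, and train/validation workflow above concrete, here is a minimal sketch assuming PyTorch and torchvision. The directory layout, two-class setup, and hyperparameter values are illustrative assumptions, not the speaker's actual pipeline.

```python
# Minimal sketch of transfer learning with data augmentation, assuming PyTorch/torchvision.
# The folders data/train and data/val (labeled frames) are hypothetical.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Data augmentation is applied to the training set only; validation uses a fixed resize.
train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
])
val_tf = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

train_ds = datasets.ImageFolder("data/train", transform=train_tf)
val_ds = datasets.ImageFolder("data/val", transform=val_tf)
train_loader = torch.utils.data.DataLoader(train_ds, batch_size=32, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_ds, batch_size=32)

# Transfer learning: start from ImageNet weights and replace only the final classifier.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)   # e.g. polyp vs. no polyp

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # learning rate: a training hyperparameter
criterion = nn.CrossEntropyLoss()

for epoch in range(5):                           # epoch count kept small for illustration
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

    # The validation score guides hyperparameter tuning; the held-out test set is used only once at the end.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in val_loader:
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    print(f"epoch {epoch}: validation accuracy = {correct / total:.3f}")
```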
We should also understand the difference between overfitting and generalizability in terms of model accuracy. During the validation and testing process we get two kinds of scores: one from the validation set and one from the test set. The important thing is to compare these two scores. If you get a high score on both the validation and the test set, you have a generalizable model, which is fine. However, if you get a high score on validation but a low score on testing, the model is overfitted to the training data, and this kind of overfitted model does not work well in external environments. So you should take care to distinguish overfitting from generalizability.

Let me show some examples. On the right side you can see the polyp being correctly identified by a generalizable model. With an overfitted model, however, you may be faced with very confusing, distracting outputs; this is a representative image of an overfitted model. To properly understand the difference between a generalizable and an overfitted model, I think the ultimate solution is clinical trial testing. This is, I believe, the mandatory step for ultimately assessing the performance of AI.

Here I'm presenting three don'ts in machine learning: patient-level overlap, data leakage, and exclusion of real-world data. Let me introduce them one by one.

The first is patient-level overlap, which concerns the images used for machine learning. You should not use images from the same patient in more than one of the training, validation, and testing sets; patient-level overlap must not exist across these datasets. Otherwise, the model's performance will be overestimated, because the model is effectively overfitted to those patients.

The second is data leakage during evaluation. We adjust the model to optimize its performance, but this adjustment should happen between the training and validation steps, not against the test data. Once you have completed the adjustment of the model, you proceed to testing. You should not test the model twice or more, because if you adjust the model based on the test result, the model becomes fitted to the test data and its performance is again overestimated.

The third is exclusion of real-world data, which is also very important. Let me give an example. You may find a paper reporting that the developed AI achieved 96% accuracy, which sounds fine. However, you should be very careful when interpreting this kind of result, because sometimes researchers use only high-quality images for both training and evaluation, and in that case the AI works excellently only because the model is tuned to unrealistically clean data. In real-world endoscopy practice there are a lot of low-quality images, and we should not ignore them. Therefore, we should include low-quality images at least in the testing material to avoid overestimating the model.
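As one way to respect the first two don'ts in practice, here is a minimal sketch of a patient-level split with a single, final test evaluation, assuming scikit-learn is available. The image file names and patient IDs are hypothetical.

```python
# Minimal sketch of a patient-level split to avoid patient-level overlap,
# assuming scikit-learn; image paths and patient IDs below are hypothetical.
from sklearn.model_selection import GroupShuffleSplit

images = ["p1_a.png", "p1_b.png", "p2_a.png", "p3_a.png", "p3_b.png", "p4_a.png"]
patient_ids = ["p1", "p1", "p2", "p3", "p3", "p4"]   # group label = patient, not image

# First carve out a held-out test set; all images from one patient stay on the same side.
outer = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_val_idx, test_idx = next(outer.split(images, groups=patient_ids))

# Then split the remainder into training and validation, again grouped by patient.
inner = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
rel_train_idx, rel_val_idx = next(inner.split(
    [images[i] for i in train_val_idx],
    groups=[patient_ids[i] for i in train_val_idx],
))
train_idx = [train_val_idx[i] for i in rel_train_idx]
val_idx = [train_val_idx[i] for i in rel_val_idx]

# The test set is evaluated exactly once, after all hyperparameter tuning is finished,
# to avoid data leakage from repeated testing.
print("train:", [images[i] for i in train_idx])
print("val:  ", [images[i] for i in val_idx])
print("test: ", [images[i] for i in test_idx])
```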
Finally, we need to optimize the dataset itself. This includes quantity, variety, and a reliable labeling process. As I mentioned before, quantity is the most important factor. Once we have a certain amount of data, we should focus on its variety, because variety contributes to the robustness of the model. This includes variety at the endoscopist level and at the hospital level, and we need data from multiple countries to ensure variety. From the patient perspective, we should include patients of different ethnicities, ages, and sexes. And of course, bowel preparation quality differs from case to case.

In terms of the image source, we have two options: static images or videos. Usually videos are preferred because of the much larger volume of data they provide. We also have different endoscopes, and different manufacturers providing the processors. It is also very important that the data is representative: the prevalence of polyps or diseases, their sizes, and their distribution should be in line with real clinical practice. Finally, we should make sure the labeling process is done appropriately; I'd like to discuss this in the coming session.

This is the final slide. Here I'm presenting a nice textbook for GI endoscopists who are eager to learn about AI in endoscopy. This paper is designed to provide basic knowledge on AI in endoscopy and was published in Gut in 2020. If you have any questions or further interest, please check it out. Thank you very much for your kindness.
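As a small aside on the point above that videos are usually preferred for their much larger volume of data, here is a minimal sketch of harvesting still frames from a recorded withdrawal video. It assumes OpenCV; the file name and sampling interval are placeholders, not part of the speaker's workflow.

```python
# Minimal sketch of extracting still frames from a withdrawal video with OpenCV;
# "withdrawal.mp4" and the sampling interval are hypothetical placeholders.
import os
import cv2

os.makedirs("frames", exist_ok=True)
cap = cv2.VideoCapture("withdrawal.mp4")

frame_index = saved = 0
while True:
    ok, frame = cap.read()
    if not ok:                       # end of video (or read failure)
        break
    if frame_index % 30 == 0:        # keep roughly one frame per second at 30 fps
        cv2.imwrite(f"frames/frame_{frame_index:06d}.png", frame)
        saved += 1
    frame_index += 1

cap.release()
print(f"saved {saved} frames")
```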
Video Summary
In this video, Yuichi, a Japanese gastroenterologist working on AI research in endoscopy, discusses the concept of computer vision in endoscopy. He explains that computer vision involves using machine learning and AI to optimize the analysis of images and videos in a way that mimics human vision. He highlights various examples of computer vision in endoscopy, including edge detection, optical flow, image stitching, and image processing. He also discusses the role of AI in endoscopy, specifically computer-aided detection, computer-aided diagnosis, and computer-aided quality assurance.

Yuichi emphasizes the importance of understanding AI basics and the role of deep learning in endoscopy. He explains that deep learning automatically learns the best features from images and outperforms conventional machine learning methods, especially when there is a large dataset available. He discusses the importance of collecting and separating data into training, validation, and testing sets, and the need to avoid patient-level overlap, data leakage, and exclusion of real-world data during the machine learning process. He concludes by recommending a textbook for GI endoscopists interested in learning more about AI in endoscopy.
Asset Subtitle
Yuichi Mori, MD, PhD, FASGE
Keywords
computer vision
endoscopy
machine learning
AI research
deep learning