Annotation and Segmentation
Video Transcription
So now we'll move on to Taposh, who can share his slides; he'll be discussing annotation and segmentation. We've already started a good discussion around annotation and labeling and how important that is, and of course partitioning images into different pixels when we talk about segmentation. So Taposh, you can start. Thank you.

Hi everyone, this is Taposh Roy from Kaiser Permanente. I'm going to start with disclosures. I have a financial relationship with Kaiser Permanente in the form of salary. Other than that, I am also the author of notes on medical image processing and deep learning. This year I am again serving at KDD as a senior program committee member, and I am hosting a competition on multi-dataset time series anomaly detection for univariate time series.

In today's talk we are going to discuss annotation and segmentation. We'll start with what annotation is and the types of annotation, and then move to supervised and unsupervised strategies.

So let's talk about annotation. Annotation is extra information associated with a particular point in a document, a file, or any other piece of information. It can be a note that we add, or anything additional that we know and can attach. Annotation can be done for text, images, sound, or any piece of information for that matter.

On the next slide we're going to look at some examples of how we would annotate. Let's look at these three images: A, B, and C. If we were to annotate them, we could say A is a black and white cow, B is maybe a white dog, and C is a white cow. This is an example of providing additional information, in other words, annotating.

Let's look at some more complex images. Here we see three images again. How would you annotate these, and what information would you take from them? All of these images have a couple of things in common: they have horses and they have a person riding. But if you were to annotate these images, you would annotate them in a more specific way: the first one is the game of polo, the second one is horse racing, and the third one is also the game of polo. This is about what we perceive at a glance; there is a paper from Fei-Fei Li with a very interesting observation on this, and I have put the reference here as well.
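As a minimal sketch of how whole-image annotations like these might be stored, assuming plain Python dictionaries (the filenames and field names are hypothetical, mirroring the slide examples):

```python
# A minimal sketch of whole-image annotation: each image file is
# associated with one or more labels plus optional attributes.
annotations = {
    "image_a.jpg": {"labels": ["cow"], "attributes": ["black and white"]},
    "image_b.jpg": {"labels": ["dog"], "attributes": ["white"]},
    "image_c.jpg": {"labels": ["cow"], "attributes": ["white"]},
}

# Scene-level annotation can be richer than object labels alone:
# the same objects (horse + rider) can mean different activities.
scene_annotations = {
    "image_1.jpg": {"objects": ["horse", "rider"], "activity": "polo"},
    "image_2.jpg": {"objects": ["horse", "rider"], "activity": "horse racing"},
    "image_3.jpg": {"objects": ["horse", "rider"], "activity": "polo"},
}
```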
Before we go into more detail about annotation, I want to offer a caution. Our perceptual system interprets the environment in a certain manner, and it has blind spots. There is existing research on these blind spots; in particular, I want to talk about the McGurk effect, which concerns the coordination of audio and visual perception. I'm going to play this quick video to show how it works.

[Video plays] Watch this experiment. It's called the McGurk effect. What do you think he's saying? If you think he's saying "dah, dah, dah," you are in the grand majority. But now close your eyes and listen instead of watching. The voice is saying "bah, bah, bah." Now let's replay it without the sound. Yes, it's true: his lips are saying "ga, ga, ga." So how is it that when the lips are saying "ga, ga, ga" and the sound says "ba, ba, ba," our brains read "da, da, da"? The McGurk effect is an example of an auditory illusion. It's a great way to show that the accurate perception of information can involve the participation of more than one sensory system, in this case vision with sound. We call it multimodal perception.

So the McGurk effect, or similar effects, does come into the picture when we are trying to annotate videos and images. That doesn't mean we should not do annotation; it's just a caution that our perception has blind spots.

Let's talk about image annotation a little more. The way image annotation works today, for example in self-driving cars looking at images, is that enough examples have been boxed: here you are marking a pedestrian walking, you're marking a car, you're marking a van. You put bounding boxes around the objects, and from that you determine that there are people walking, there are cars, and there is a van. So this is an example of image annotation as done in cars.

So what are the types of image annotation? At a high level, image annotation can be classified into whole-image classification, object detection within an image, and image segmentation. The idea of whole-image classification is to simply identify the objects and other properties that exist in an image. Object detection is to detect and classify a particular set of individual objects within the image. Segmentation is to recognize and understand what is in the image down to the pixel level, so that each pixel belongs to at least one of the classes.

I'll give you some examples. The first image here is an image of two birds; whole-image classification labels it as an image of two birds. Object detection enables us to identify bird one and bird two, so there are two objects here. And finally, segmentation helps us understand and separate multiple parts at the pixel level: this is a bird, this is a bird, this is a branch, and this is the background, which is the leaves. So these are the different types of image annotation currently being used (a minimal data sketch of these three granularities appears below).

Let's come back to machine learning paradigms. There are two big paradigms in machine learning: supervised and unsupervised. Supervised learning uses past labeled information to predict the future, as in classification and regression. Unsupervised learning tries to find structure without labels; clustering and anomaly detection are good examples.

So let's talk about supervised learning a little more in this context. Supervised learning, as the name suggests, is trained on the labeled data that we provide. This is a very critical and important step, because it determines how the models will be trained on that particular data. One of the biggest challenges in healthcare is data, specifically labeled data. When you have accurately labeled data, with both the independent and the dependent variables quantified, identified, and tagged, it becomes much easier to develop models. Annotation and segmentation certainly help with that, and I'll give you an example of a dataset in gastroenterology in the next few slides where a lot of annotation and segmentation has been done.
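As a minimal sketch of how the three annotation granularities above look as data, assuming a single hypothetical training record (the field names, shapes, and class indices are illustrative, not from the talk or any specific dataset):

```python
import numpy as np

# One hypothetical labeled example combining the three annotation
# granularities: whole-image label, object boxes, per-pixel mask.
H, W = 480, 640

example = {
    "image": np.zeros((H, W, 3), dtype=np.uint8),  # raw pixels (placeholder)
    # 1. Whole-image classification: one label for the entire image.
    "image_label": "birds",
    # 2. Object detection: a class plus a bounding box (x, y, width, height)
    #    for each individual object in the image.
    "objects": [
        {"class": "bird", "bbox": (120, 80, 60, 40)},
        {"class": "bird", "bbox": (300, 150, 55, 45)},
    ],
    # 3. Segmentation: a per-pixel class index, e.g. 0 = background,
    #    1 = bird, 2 = branch; the densest form of supervision.
    "segmentation_mask": np.zeros((H, W), dtype=np.uint8),
}
```

The general trade-off: richer annotation (label, then boxes, then per-pixel mask) gives stronger supervision but costs more expert time, which is exactly the labeled-data bottleneck in healthcare described above.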
This is the work done in the HyperKvasir paper, a comprehensive dataset of gastrointestinal endoscopy images. The images in this dataset were collected from real examinations at Bærum Hospital in Norway. The dataset contains about 110,000 images and 374 videos, representing anatomical landmarks as well as pathological and normal findings; the total number of images and video frames is more than a million. This is a published paper which you can read, and I've put the link here. The images are classified into the upper GI tract and the lower GI tract, so at the highest level they are labeled into two different tracts. Within the upper GI tract, the labels are further separated into anatomical landmarks and pathological findings, whereas in the lower GI tract, in addition to anatomical landmarks and pathological findings, there are also therapeutic interventions and quality-of-mucosal-views categories. This is a super comprehensive dataset, released last year, and having this kind of information has been really helpful for machine learning. A lot of this data was manually annotated by senior gastroenterologists. That is the supervised setting.

There is another, unsupervised setting, where we want to find patterns automatically. Before we go there, I want to take a small segue into automatic image descriptions. There has been work done outside medicine, in particular the paper by Andrej Karpathy and Dr. Fei-Fei Li, where image descriptions are generated from an image: you give it an image and it creates descriptions. In this example from the paper itself, the model ultimately understands that this is a dining table with breakfast items, and among those breakfast items there can be a glass of water with lemon, a cup of coffee, a water bottle, a tablet, and other things. This is a paper from Stanford published in 2015, six years prior to where we are today. The goal is to do something similar in medicine and healthcare as well.

One of the papers that attempted this was by Dr. Coimbra and his team. They used automated image segmentation with a few unsupervised clustering algorithms, namely mean shift, normalized cuts, and level sets, for automatic classification of gastric tissues into multiple classes. In this case they used three classes: cancerous, pre-cancerous, and normal. They applied these segmentation techniques, then did feature extraction, and then classified the tissue into those three categories. It's a pretty good paper for reference.

We are going to drill a little into these techniques. At a high level, the simpler, older techniques are mean shift, normalized cuts, and level sets. Mean shift, as the name suggests, is a well-known method for clustering analysis. The whole concept behind shifting the mean is to find the mode of the data and shift toward it. What I am trying to show with this image is that the red dots keep moving until they converge on the area of highest density, the mode. So this is one example of how clustering is used to identify the main area or point of interest, for example an area with an issue to look at.
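As a minimal mean-shift sketch, assuming scikit-learn's MeanShift applied to the pixel colors of a stand-in random image (a real use would load an actual endoscopy frame, and would likely add spatial coordinates to the features):

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

# Mean-shift segmentation sketch: cluster pixels by color so each
# pixel migrates toward the nearest mode (region of highest density)
# in color space.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)  # stand-in image

pixels = img.reshape(-1, 3).astype(float)
bandwidth = estimate_bandwidth(pixels, quantile=0.1, n_samples=500)

ms = MeanShift(bandwidth=bandwidth, bin_seeding=True)
labels = ms.fit_predict(pixels)

segments = labels.reshape(img.shape[:2])  # per-pixel cluster index
print("found", labels.max() + 1, "modes")
```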
Another type of clustering to look at is normalized cuts. The normalized cut method treats the image as a graph partitioning problem. In this example you see three different shades in the image: image B shows the combination of the first two shades, image C the second and the third, and image D the first and the third. This is image segmentation by partitioning the image into its components. It was one of the early strategies, and it is still used in some research papers today for image segmentation.

But I think the best in class is level sets, which is also used in that paper. One of the biggest advantages of the level set method is that it provides a good measurement of curvature. The example here is that of a hand. Initially, the model identifies that these pixels differ from the pixels in the background; it then slowly contours around them until the contour closely matches the hand. This helps in finding the accurate position of an object, and not only its position but also its length, width, and surrounding curvature. So this has a lot of applications, especially in unsupervised image processing. There are also more complex algorithms than these; once we use those kinds of algorithms, we will be able to identify more of an image automatically, though there is a little more work needed before they can be used for medical image processing.

I'll give you an example of three methods that are close to the state of the art right now: DBSCAN, HDBSCAN (hierarchical DBSCAN), and topological data analysis. All three are unsupervised clustering methods. There is very little research so far on their use in medical image processing, but they can be used in the future to provide better information about what is in the images.
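As a minimal DBSCAN sketch, assuming scikit-learn and synthetic 2-D points (in an imaging setting the rows would instead be per-pixel or per-region feature vectors):

```python
import numpy as np
from sklearn.cluster import DBSCAN

# DBSCAN sketch: density-based clustering that labels low-density
# points as noise (-1) instead of forcing them into a cluster.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(100, 2))
blob_b = rng.normal(loc=(3.0, 3.0), scale=0.3, size=(100, 2))
noise = rng.uniform(-2.0, 5.0, size=(20, 2))
X = np.vstack([blob_a, blob_b, noise])

labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("clusters:", n_clusters, "noise points:", int(np.sum(labels == -1)))
```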
With that, I'd like to conclude my talk, and we can discuss more during the discussion session. Thank you.

Video Summary
In this video, Taposh Roy from Kaiser Permanente discusses annotation and segmentation. He starts by explaining that annotation is extra information added to a document or file, and that it can be done for text, images, sound, or any other type of information. To illustrate, he provides examples of annotating images, such as labeling a black and white cow or identifying different objects and activities in images. He also discusses blind spots in our perceptual systems and cautions that they can affect annotation. He then delves into image annotation techniques, covering whole-image classification, object detection, and image segmentation, explaining that segmentation involves recognizing and understanding the content of an image at the pixel level. He touches on supervised and unsupervised machine learning paradigms and their applications in healthcare, and mentions HyperKvasir, a comprehensive dataset of gastrointestinal images that was manually annotated by senior physicians. Finally, he discusses clustering algorithms such as mean shift, normalized cuts, level sets, and DBSCAN, and their potential use in medical image processing, before inviting further discussion during the Q&A session.
Asset Subtitle
Taposh Roy
Keywords
annotation
segmentation
image annotation
image segmentation
machine learning