Among the many applications of Deep Learning in healthcare, segmentation is undoubtedly one of the most studied, given the broad range of possible advantages that it could bring.

Nevertheless, segmentation is not a costless task. First, as in most healthcare applications, obtaining high-quality images is not trivial. Second, the annotation phase is extremely costly in time and resources, especially compared with the labeling required for classification or even object detection.

Training a segmentation model that can also rely on other sources of information, such as text, would be a turning point for medical segmentation.

✅ Researchers propose LViT (Language meets Vision Transformer), a new vision-language model for medical image segmentation.

✅ Medical text annotations are introduced to compensate for quality deficiencies in the image data.

✅ Experimental results show that the model achieves better segmentation performance in both fully and semi-supervised settings.

✅ So far, the proposed model has only been evaluated on 2D medical data.
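To make the core idea concrete, here is a minimal NumPy sketch of one simple way text can guide segmentation: project a sentence embedding of the medical annotation into the visual channel space, broadcast-add it into the image feature map, and predict a per-pixel mask. This is an illustrative toy, not LViT's actual architecture; all shapes, the function name `fuse_and_segment`, and the additive fusion are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_and_segment(img_feats, text_emb, W_txt, W_out):
    """Fuse a text embedding into an image feature map and predict a mask.

    img_feats: (C, H, W) visual features from some image encoder
    text_emb:  (D,) sentence embedding of the medical text annotation
    W_txt:     (C, D) learned projection from text space to visual channels
    W_out:     (1, C) 1x1-conv-style head producing per-pixel mask logits
    (All of these are hypothetical stand-ins, not LViT's real modules.)
    """
    t = W_txt @ text_emb                  # (C,) project text into channel space
    fused = img_feats + t[:, None, None]  # broadcast-add text cue to every pixel
    C, H, W = fused.shape
    logits = (W_out @ fused.reshape(C, H * W)).reshape(H, W)
    return sigmoid(logits)                # per-pixel foreground probability

# Toy shapes: 16 channels, 8-dim text embedding, 4x4 feature map.
C, D, H, W = 16, 8, 4, 4
mask = fuse_and_segment(rng.normal(size=(C, H, W)),
                        rng.normal(size=D),
                        rng.normal(size=(C, D)) * 0.1,
                        rng.normal(size=(1, C)) * 0.1)
print(mask.shape)  # (4, 4)
```

The point of the sketch is only that the text enters the pipeline as an extra feature source, so weakly labeled or lower-quality images can still be segmented with the help of the accompanying report.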

