Deepomatic takes a closer look at cancer using its smart image recognition technology
Everybody knows someone who has been diagnosed with cancer. Patient survival chances are immensely improved when the cancer is detected and treated early. Effective screening is therefore key. Unfortunately, evaluating screenings can be cumbersome, time-consuming, and prone to errors, often resulting in unnecessary and invasive treatments with painful side effects. This is where AI comes in.
At Deepomatic, we strongly believe that AI can assist doctors in their patient diagnoses with real-time screening analyses that are just as accurate as those performed by a trained professional. The automatic analysis will not only save time, but enable diagnoses in remote places where trained personnel are less readily available.
Based on this premise, Deepomatic embarked upon a very promising project with Light for Life Technologies (LLTech), a French start-up developing a scanner for in-depth microscopic tissue imaging. Images from a tissue sample are captured within mere minutes, dramatically reducing the waiting time for results. In addition, there is no need for any kind of tissue preparation, modification, or staining, allowing the reuse of the biopsy for further analyses.
Let’s take a closer look at how we used our image recognition platform to help us understand the implications of deep learning on diagnosing cancers.
Create a dataset of labelled cancer images
LLTech provided us with 18 images of biopsies containing cancerous cells and 122 images of biopsies without any abnormalities. The abnormal regions within the former images were identified with polygon annotations using Deepomatic’s annotation tool.
Since the abnormal regions are much smaller than the full images (<1% of the image area), the images were split into 256 × 256 pixel sub-images to improve training performance. This approach also had the advantage of removing pure background sub-images (without any tissue) from the dataset, and of locating abnormal tissue regions without a trained detection algorithm. The sub-images were then grouped into two categories: healthy and cancer.
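This tiling-and-filtering step can be sketched as follows. The pure-background test (pixels equal to a background value) and the keep threshold are illustrative assumptions for this sketch, not details of the actual pipeline:

```python
import numpy as np

TILE = 256  # sub-image side in pixels

def tile_image(img, tissue_threshold=0.5, background_value=0):
    """Split a grayscale image into TILE x TILE sub-images, keeping
    only tiles that contain enough tissue.

    `tissue_threshold` and `background_value` are assumed values for
    illustration, not parameters from the original pipeline."""
    h, w = img.shape
    tiles = []
    for y in range(0, h - TILE + 1, TILE):
        for x in range(0, w - TILE + 1, TILE):
            tile = img[y:y + TILE, x:x + TILE]
            # Fraction of pixels that are not pure background
            tissue_fraction = np.mean(tile != background_value)
            if tissue_fraction > tissue_threshold:
                tiles.append(((y, x), tile))
    return tiles

# Synthetic example: a 512x512 image whose left half is tissue
img = np.zeros((512, 512), dtype=np.uint8)
img[:, :256] = 128
kept = tile_image(img)
print(len(kept))  # only the 2 left-half tiles pass the tissue filter
```

Keeping the tile coordinates alongside the pixels is what lets the classifier's per-tile predictions later be mapped back onto the full biopsy image, effectively localizing abnormal regions without a detector.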
For the healthy category, only sub-images from the 122 normal biopsies containing more than 50% tissue were chosen. For the cancer category, sub-images from the 18 abnormal biopsies containing more than 40% cancerous tissue were used. To avoid the biased training that such an unbalanced dataset would cause, the cancer category was augmented by randomly cropping additional sub-images within the annotated regions, and sub-images were removed from the healthy category.
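The random-crop augmentation might look like the sketch below, assuming the polygon annotations have been rasterized into a binary mask (that representation, the attempt cap, and the helper name are assumptions of this sketch; the 40% threshold mirrors the one above):

```python
import random
import numpy as np

TILE = 256  # sub-image side in pixels

def random_crops_in_region(img, mask, n_crops,
                           min_cancer_fraction=0.4, seed=0):
    """Sample random TILE x TILE crops whose overlap with the
    annotated (mask == 1) region is at least `min_cancer_fraction`.
    Used to augment the under-represented cancer category."""
    rng = random.Random(seed)
    h, w = img.shape
    crops = []
    attempts = 0
    # Cap attempts so tiny annotated regions cannot loop forever
    while len(crops) < n_crops and attempts < n_crops * 50:
        attempts += 1
        y = rng.randrange(0, h - TILE + 1)
        x = rng.randrange(0, w - TILE + 1)
        mask_patch = mask[y:y + TILE, x:x + TILE]
        if mask_patch.mean() >= min_cancer_fraction:
            crops.append(img[y:y + TILE, x:x + TILE])
    return crops

# Example: the annotated region covers the top-left 300x300 corner
img = np.random.randint(0, 256, (600, 600), dtype=np.uint8)
mask = np.zeros((600, 600), dtype=np.float32)
mask[:300, :300] = 1.0
extra = random_crops_in_region(img, mask, n_crops=5)
print(len(extra))
```

Because the crops are sampled at arbitrary offsets rather than on the fixed tile grid, each one shows the annotated tissue in a slightly different framing, which is what makes this a useful augmentation rather than mere duplication.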
The final dataset contained 5319 sub-images across the healthy and cancer categories, of which 25%, i.e. 1330 randomly chosen sub-images, were held out to test the algorithm's performance.
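A random 25% hold-out like this is straightforward to sketch (the fixed seed is an assumption added so the example is reproducible):

```python
import random

def split_dataset(items, test_fraction=0.25, seed=0):
    """Shuffle the items and hold out `test_fraction` of them for
    testing, returning (train, test)."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    shuffled = items[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# 5319 sub-images, as in the dataset described above
train, test = split_dataset(list(range(5319)))
print(len(train), len(test))  # 3989 1330
```

Note that 25% of 5319 rounds to exactly the 1330 test sub-images mentioned above; the remaining 3989 are used for training.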
Train a custom model to diagnose cancerous tissue
Deepomatic’s platform has been developed to easily train custom models. What does this mean exactly? It means that models can be trained to identify any concept present in images uploaded to the platform. Back to our mission: after uploading the pre-processed images and splitting them into the labels described above, as well as into training and test datasets, a pre-trained GoogLeNet architecture was chosen to be fine-tuned on that dataset. Without any other pre-processing or tweaking of the algorithm, an 89% accuracy for classifying healthy tissue and a 93% accuracy for classifying cancerous tissue were obtained after only 10 training epochs. In the latest version of our software, different pre-trained architectures (AlexNet, ResNet, GoogLeNet) can be chosen, and their corresponding hyper-parameters, such as the learning rate policy and decay rate, can be modified to improve performance.
Learn to detect cancer, one image at a time
When training models to recognize particular concepts, it can often be frustrating if you don’t meet your target performance rate; what’s even more frustrating is when you don’t know why. Fortunately, our tool allows users to easily identify the reasons behind unsatisfactory performance. In the cancer classification case, we could easily see that:
- the annotations were not precise and informative enough,
- the background was an important feature for the algorithm and needed to be eliminated,
- the dataset was not diverse enough, with only 18 abnormal images, from which less than 1% of the sub-images were used for training.
By controlling these variables, simply doubling the number of images in the dataset would drastically raise performance rates and make automatic detection a reliable and convenient option. What’s more, the major advantage of LLTech’s medical revolution is that the data is processed in real time; it can be sent by remote transmission directly to an artificial intelligence system to help with the diagnosis. Doctors could receive biopsy results directly in the operating room, thus avoiding a second intervention. There is no doubt that medical technology powered by deep learning will have a revolutionary impact on the diagnosis of cancers.
This solution is, of course, accompanied by challenges that remain to be overcome. Annotation requires experts’ time, so obtaining high-quality annotated datasets will remain a costly challenge for years to come. Also, to be more accurate and useful, we will need to train on several classes in order to classify cancer stages. The next step of our collaboration with LLTech is to train new algorithms on their new Dynamic Cell Imaging (DCI) technique, which can provide complementary sub-cellular contrast to the existing imaging. Watch this space.
Thank you for reading.