Purpose: To evaluate the robustness and redundancy of radiomics features extracted from CT images.
Methods and Materials: 28 patients with metastatic clear-cell renal carcinoma were enrolled before therapy initiation. Tumours were manually delineated by three expert physicians. Three categories of features were extracted: shape descriptors, intensity histogram statistics and textural features. The impact of filters and grey-level scales on the features was studied. Concordance correlation coefficients (CCC) and intra-class correlation coefficients (ICC) were calculated to assess reproducibility. Spearman correlations were computed to assess feature redundancy.
Results: 1564 radiomics features were extracted from each image. Different filters had little effect and grey levels no significant effect on extracted radiomics feature values. 231/1564 features showed high reproducibility for ICC (ICC≥0.8), and 198/1564 for CCC (CCC≥0.9). Features with an ICC≥0.8 and CCC≥0.9 were considered the most robust. This step reduced the number of relevant features to 158. Among these, highly correlated features with correlation ≥ 0.9 were removed. This procedure yielded 23 features both robust and independent.
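The two selection steps above (reproducibility, then redundancy) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how correlated features were removed, so the greedy keep-first rule, the function names and the toy feature values are all assumptions. Lin's concordance correlation coefficient is used for the CCC.

```python
from statistics import mean


def ccc(x, y):
    """Lin's concordance correlation coefficient between two sets of measurements."""
    mx, my = mean(x), mean(y)
    n = len(x)
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * cov / (vx + vy + (mx - my) ** 2)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation (undefined for constant inputs)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def drop_redundant(features, threshold=0.9):
    """Greedy filter (an assumption): keep a feature only if its absolute
    Spearman correlation with every feature kept so far is below threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(spearman(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature values for five patients.
toy_features = {
    "volume": [1, 2, 3, 4, 5],
    "surface": [2, 4, 5, 8, 11],   # monotone in volume, hence redundant
    "entropy": [3, 1, 4, 2, 5],
}
kept_features = drop_redundant(toy_features)
print(kept_features)
```

With a greedy rule the result depends on feature order, so a real pipeline would need a stated tie-breaking criterion (for example, keeping the more reproducible feature of each correlated pair).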
Conclusion: This study clarifies feature stability and redundancy, as well as the impact of pre-processing filters and grey-level scales. These steps are mandatory before radiomics features can be used to predict therapy response and outcome in oncology.
Purpose: To characterize, through a radiomic approach, the nature of areas classified PI-RADS 3/5, recognized in multiparametric prostate magnetic resonance with T2-weighted (T2w), diffusion and perfusion sequences with paramagnetic contrast.
Methods and Materials: 24 cases undergoing multiparametric prostate MR and biopsy were included in this pilot study. The nature of the PI-RADS 3/5 areas was established by biopsy, which found eight malignant tumours. The analysed images were acquired on a Philips Achieva 1.5T scanner with a CE-T2-weighted sequence in the axial plane. Semi-automatic tumour segmentation was carried out on the MR images using the 3DSlicer image analysis software. 45 shape-based, intensity-based and texture-based features were extracted and served as the input for pre-processing. An evolutionary algorithm (a TWIST system based on the KNN algorithm) was used to subdivide the dataset into training and testing sets and to select the features yielding the maximal amount of information. After this pre-processing, 20 input variables were selected, and different machine learning systems were used to develop a predictive model based on a training-testing crossover procedure.
Results: The best machine learning system (a three-layer feed-forward neural network) obtained a global accuracy of 90% (80% sensitivity and 100% specificity) with an area under the ROC curve of 0.82.
Conclusion: Machine learning systems coupled with radiomics show promising potential for distinguishing benign from malignant tumours in PI-RADS 3/5 areas.
Purpose: To compare the performance of an expert radiologist to that of quantitative radiomics for prediction of local resectability of pancreatic ductal adenocarcinoma on routine abdominal CT.
Methods and Materials: We included 66 patients (m:f=32:34; range 35-82 yrs) with histologically proven pancreatic ductal adenocarcinoma who were operated on within 4 weeks of an initial routine portal-venous phase multidetector-row CT examination. An expert abdominal radiologist scored the CT data for tumor resectability. Another expert abdominal radiologist drew tumor contours to obtain a volume of interest, from which we computed 90 intensity, shape and texture features. During training, the feature vector was reduced with a feature selection algorithm and combined with several classifiers to predict local resectability using radiomics.
Results: There were 43 hypo- and 23 iso-attenuating tumors, of which 29 were resectable and 37 non-resectable. The best classification result was obtained with a logistic regression classifier with the feature vector reduced by regularized discriminative feature selection. Accuracy for predicting resectability was 67% for the radiologist, and 83% for radiomics. Sensitivity, specificity, positive and negative predictive value for resectability were 90%, 49%, 58% and 86%, respectively, for the radiologist and 79%, 86%, 84%, and 82%, respectively, for radiomics.
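The reported measures all derive from a 2x2 table with "resectable" as the positive class. The sketch below shows the arithmetic. The counts for the radiologist (26 true positives, 3 false negatives, 18 true negatives, 19 false positives) are not given in the abstract; they are a reconstruction assumed here because they are consistent with 29 resectable and 37 non-resectable tumors and reproduce the reported percentages after rounding.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from 2x2 confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Assumed (reconstructed) counts for the radiologist; not stated in the abstract.
radiologist = diagnostic_metrics(tp=26, fn=3, tn=18, fp=19)
for name, value in radiologist.items():
    print(f"{name}: {value:.0%}")
```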
Conclusion: Quantitative CT-based radiomics for prediction of resectability of pancreatic adenocarcinoma on routine CT may outperform expert radiologists with respect to accuracy and positive predictive value but at a lower sensitivity.
Purpose: To analyse MR images acquired before CyberKnife treatment in order to predict the response and avoid unnecessary radiosurgery for patients.
Methods and Materials: T1-weighted MR images of 38 patients presenting with an acoustic neuroma treated with CyberKnife® at the CDI (52.6% responders with volume reduction) were selected and analysed. The analysed images were acquired on 1.5T machines with contrast-enhanced T1-weighted sequences in the axial plane. Semi-automatic tumour segmentation was carried out on the MR images using the 3DSlicer image analysis software. Shape-based, intensity-based and texture-based features were extracted. An evolutionary algorithm (a TWIST system based on the KNN algorithm) was used to subdivide the dataset into training and test sets and to select the features yielding the maximal amount of information. After this pre-processing, a predictive model based on a training-testing crossover procedure was developed. The best neural network obtained was a three-layer feed-forward back-propagation network with 8 input variables containing the maximal amount of information.
Results: The neural network was used twice, inverting the training and testing sets. In the first analysis, the sensitivity was 100% and the specificity 77.78%, giving a global accuracy of 88.89%. In the second analysis, the sensitivity was 61.54% and the specificity 100%, with a global accuracy of 80.77%. The mean global accuracy was 84.83%.
Conclusion: The results show that machine learning coupled with radiomics has great potential for distinguishing, before treatment, responders with volume reduction from responders without volume reduction to radiosurgery.
Purpose: The quick, accurate and inexpensive translation of medical reports is a task of increasing importance in a globalised world with many patients crossing language borders during their treatment. Recently, deep learning-powered translation engines showed impressive results translating prose texts. The purpose of our study was to assess the utility of such engines in a radiological context.
Methods and Materials: 20 German radiology reports of oncologic follow-up examinations generated between March and September 2017 were randomly selected. The impression section was translated into English, French and Italian by two deep learning-powered translation engines (DeepL and Google Translate). The translations were then evaluated by three bilingual radiologists in the respective language (RH: English; NG: French; RW: Italian). Three error categories were labeled (stylistic error, content error without significance, potentially dangerous error).
Results: Overall, 79% of all words were translated correctly. However, only nine out of 120 translations were free of potentially dangerous errors, and altogether 806 mistranslations were identified. The two translation engines showed comparable performance. Of note, the quality of the translations into English was noticeably superior, with 89% of all words correctly translated and 43% fewer potentially dangerous errors.
Conclusion: Deep learning-powered translation engines in their current version are not suitable for automated translation of complex oncologic radiology reports. However, since the performance of neural networks depends heavily on the data they were trained on, training on medical texts holds enormous potential for better results in the near future.
Purpose: To develop and validate a deep learning-based algorithm pipeline for fast detection and localisation of skull fractures on non-contrast CT scans. All kinds of skull fractures (undisplaced, depressed, comminuted, etc.) were included in the study.
Methods and Materials: An anonymized and annotated dataset of 350 scans (11750 slices) with skull fractures was used for generating candidate proposals for fractures. A stacked network pipeline was used for candidate generation: a fully convolutional network for ROI generation and another deep convolutional network for ROI classification. The final ROI classification model (ResNet18) yielded fracture probabilities for the candidates generated by the fully convolutional (UNet) network. A separate deep learning model was trained to detect haemorrhages at scan level, and its output was used as a proxy for clinical information. Fracture candidate features such as size, probability and depth for the 5 most probable fracture candidates, along with the haemorrhage model confidence (p_haemorrhage), were combined to train a random forest classifier to detect fractures at scan level. When a fracture was predicted, the most probable candidate(s) were used for localization.
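The way candidate-level outputs could be assembled into a fixed-length scan-level input for the random forest can be sketched as below. This is an illustration under stated assumptions: the function name, the dictionary keys, the feature order and the zero padding used when a scan has fewer than five candidates are all hypothetical and are not described in the abstract.

```python
def scan_feature_vector(candidates, p_haemorrhage, top_k=5):
    """Build one feature vector per scan from fracture candidates.

    candidates: list of dicts with 'size', 'probability' and 'depth'
    (hypothetical keys). Returns 3 * top_k candidate features followed
    by the haemorrhage model confidence.
    """
    ranked = sorted(candidates, key=lambda c: c["probability"], reverse=True)
    ranked = ranked[:top_k]
    vector = []
    for candidate in ranked:
        vector += [candidate["size"], candidate["probability"], candidate["depth"]]
    # Zero padding for missing candidates is an assumption of this sketch.
    vector += [0.0] * (3 * (top_k - len(ranked)))
    vector.append(p_haemorrhage)
    return vector


# Hypothetical scan with two candidates.
example_vector = scan_feature_vector(
    [
        {"size": 12.0, "probability": 0.4, "depth": 3.0},
        {"size": 30.0, "probability": 0.9, "depth": 5.0},
    ],
    p_haemorrhage=0.7,
)
print(len(example_vector))
```

A fixed-length vector is what lets a standard tabular classifier such as a random forest consume a variable number of candidates per scan.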
Results: A separate set of 2971 scans, uniformly sampled from the database with no exclusion criteria, was used for testing scan-level decisions; 108 of these scans were reported as skull fracture cases. For scan-level fracture decisions, the area under the receiver operating characteristic curve (AUC-ROC) was 0.83 with p_haemorrhage as a feature and 0.72 without. The free-response receiver operating characteristic curve yielded a sensitivity of 0.9 at 2.85 false positives per scan. Prediction for each patient takes <30 s.
Conclusion: A deep learning-based pipeline can accurately detect and localize skull fractures. The pipeline can be used to triage patients for the presence of skull fractures.
Purpose: We create a novel method for fast and accurate estimation of kidney volumes and signal intensity time courses in DCE-MRI, aiming at extracting both structural and functional quantitative information from the moving kidney.
Methods and Materials: Two repeated SPGR-DCE-MRI datasets were acquired from 20 healthy volunteers, resulting in 40 examinations, each consisting of 74 volumes recorded over ~6 min. We trained a 3D convolutional neural network (using a single standard NVIDIA GeForce 1080Ti GPU) for segmentation of left and right kidneys. The network has a dual-pathway architecture, incorporating both local and global information in the volumes. To create training data, we manually delineated 10 individual volumes from 10 different time-series, and extended the delineations to 740 volumes using image registration.
Results: Our implementation is able to segment all 74 volumes in a previously unseen, unregistered recording in less than 7 minutes. Mean segmentation accuracy (Dice) was 0.843 (SD=0.010). Mean (SD) left and right kidney volumes [ml] (incl. renal hilum) in one of the subjects (FF03), examined seven days apart (MR1 and MR2), were: MR1 L: 301.6 (15.9), R: 389.8 (16.9); MR2 L: 307.4 (17.8), R: 395.3 (23.6).
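For reference, the Dice coefficient used as the accuracy measure is twice the overlap between the predicted and reference masks, divided by the sum of their sizes. A minimal sketch on flattened binary masks follows; the function name and the convention of returning 1.0 for two empty masks are assumptions of this example.

```python
def dice(pred, truth):
    """Dice coefficient between two flat sequences of 0/1 voxel labels."""
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    size = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as perfect agreement (a convention, assumed here).
    return 1.0 if size == 0 else 2 * overlap / size


# Toy masks: one shared voxel, two voxels in each mask.
toy_dice = dice([1, 1, 0, 0], [1, 0, 1, 0])
print(toy_dice)
```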
Conclusion: A CNN is able to quickly and accurately segment the moving kidneys in DCE-MRI, providing estimates of kidney volumes and mean signal intensity time courses. We are currently working to achieve sub-segmentation of the kidney (cortex, medulla, pelvis) and segmentation of the aorta (for AIF), enabling automated and fast estimation of GFR directly from the DCE-MRI.
Purpose: Intracranial carotid artery calcification (ICAC) is a major risk factor for stroke, and might contribute to dementia and cognitive decline. Further research into the relationship between ICAC and neurological diseases is much needed, but is hampered by the time-consuming manual annotation of ICAC lesions. Therefore, we introduce the first fully automatic ICAC lesion detection method.
Methods and Materials: Non-contrast-enhanced CT scans were performed in 1882 participants of the Rotterdam Study, a population-based cohort study (mean age 69.6 (6.8), 51.7% female). Two trained observers annotated the scans by indicating regions of interest along the intracranial carotid artery track (from the horizontal petrous segment to the circle of Willis) where calcifications were visible. ICAC lesion segmentations were obtained from these annotations by thresholding at 130 HU. We developed a deep learning-based algorithm to automatically delineate ICAC lesions in CT scans. The algorithm was trained on scans of 882 subjects and validated on 1,000 scans of other subjects.
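The step from observer annotations to reference segmentations can be sketched as a threshold applied inside the annotated region of interest. The function name and the flat-list data layout are hypothetical, and treating the 130 HU threshold as inclusive is an assumption, since the abstract does not specify it.

```python
def segment_calcification(hu_values, roi_mask, threshold=130):
    """Mark voxels as calcified if they lie inside the annotated ROI and
    reach the HU threshold (inclusive comparison is an assumption)."""
    return [bool(inside) and hu >= threshold
            for hu, inside in zip(hu_values, roi_mask)]


# Toy example: four voxels, the last one outside the annotated ROI.
toy_segmentation = segment_calcification([90, 130, 450, 600], [1, 1, 1, 0])
print(toy_segmentation)
```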
Results: The intraclass correlation between ICAC volumes computed from manual and automatic ICAC segmentations was 97.7%. This is close to the interrater agreement of 99%. The Bland-Altman bias (95% CI) was 197 (-384, 778). (Mean ICAC volume was 1151 ± 1624.)
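A Bland-Altman analysis compares paired measurements through the mean of their differences (the bias) and an interval around it. The sketch below computes the bias and the conventional 95% limits of agreement (bias ± 1.96 SD of the differences). Whether the interval reported above was obtained this way is an assumption; the abstract labels it a 95% CI. The function name and toy volumes are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(automatic, manual):
    """Return (bias, lower limit, upper limit) for paired measurements,
    using the sample SD of the differences and a 1.96 multiplier."""
    diffs = [a - m for a, m in zip(automatic, manual)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Toy paired volumes (hypothetical values).
toy_bias, toy_lower, toy_upper = bland_altman([10, 12, 14], [8, 11, 11])
print(toy_bias, toy_lower, toy_upper)
```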
Conclusion: Our algorithm can be used to automate the time-consuming manual annotation of ICAC in large epidemiological studies, whilst maintaining a comparable level of accuracy. This can facilitate research into causes and consequences of ICAC, which might result in development of new treatments and establishment of ICAC volume as a stroke risk estimator in clinical practice.
Purpose: To evaluate a patch-based deep learning approach for liver metastases detection that models the variability between liver metastases, normal liver parenchyma and normal liver boundary.
Methods and Materials: This research was supported by the ISRAEL SCIENCE FOUNDATION (grant No. 1918/16). In this study we evaluate a patch-based convolutional neural network (CNN) approach for liver metastases detection on portal phase CT studies. Patches (~32×32 pixels/patch) are extracted from each liver and then fed into a multi-class CNN, which classifies the patches into three categories: (1) liver metastases, (2) normal liver parenchyma, (3) normal liver boundaries. The network's decisions between the three categories are then fused to reach a binary lesion versus non-lesion decision. Data augmentation (flipping, rotation) was applied to increase the number of patches. A senior radiologist segmented all liver and metastases boundaries. True positive rate (TPR) and false positives per case (FPC) were compared to those of a binary patch-based CNN classifier (metastases vs. liver).
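One simple way to fuse the three-class output into a binary decision is sketched below. The abstract does not describe the fusion rule, so the rule here (a patch is a lesion only when the metastasis class has the highest probability, with both normal classes counting as non-lesion) is an assumption, as are the names and the example probabilities.

```python
CLASSES = ("metastasis", "parenchyma", "boundary")


def is_lesion(probabilities):
    """Fuse three-class patch probabilities into lesion vs non-lesion.

    probabilities: dict mapping each class in CLASSES to its probability.
    Assumed rule: lesion only if 'metastasis' is the most probable class.
    """
    best = max(CLASSES, key=lambda c: probabilities[c])
    return best == "metastasis"


# Hypothetical patch outputs.
patch_a = is_lesion({"metastasis": 0.6, "parenchyma": 0.3, "boundary": 0.1})
patch_b = is_lesion({"metastasis": 0.4, "parenchyma": 0.1, "boundary": 0.5})
print(patch_a, patch_b)
```

Under this rule, a dark patch at the liver edge can be absorbed by the boundary class instead of being forced into a metastasis-versus-liver choice, which is the motivation the abstract gives for the third class.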
Results: The entire dataset included CT images of 132 livers with 498 2D segmented liver metastases. The CNN was trained using ~140,000 patches for each class. Evaluation was performed on the 132 livers with 2-fold cross validation. TPR and FPC were 85.9% and 1.9 for the multi-class CNN and 80.0% and 2.8 for the binary CNN.
Conclusion: The multi-class deep learning algorithm shows promising results in the liver metastases detection task. Using prior knowledge of medical data, such as differences between the liver interior and its boundaries, may enhance CNN results.