Joint Multimodal Deep Learning-based Automatic Segmentation of Indocyanine Green Angiography and OCT Images for Assessment of Polypoidal Choroidal Vasculopathy Biomarkers
Singapore Eye Research Institute, Singapore National Eye Center, Singapore, Singapore; Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
Department of Biomedical Engineering, Duke University, Durham, North Carolina; Department of Ophthalmology, Duke University Medical Center, Durham, North Carolina
Purpose
To develop a fully automatic hybrid algorithm to jointly segment and quantify biomarkers of polypoidal choroidal vasculopathy (PCV) on indocyanine green angiography (ICGA) and spectral domain-OCT (SD-OCT) images.
Design
Evaluation of diagnostic test or technology.
Participants
Seventy-two participants with PCV enrolled in clinical studies at Singapore National Eye Center.
Methods
The dataset consisted of 2-dimensional (2-D) ICGA and 3-dimensional (3-D) SD-OCT images, which were spatially registered and manually segmented by clinicians. A deep learning-based hybrid algorithm called PCV-Net was developed for automatic joint segmentation of biomarkers. The PCV-Net consisted of a 2-D segmentation branch for ICGA and a 3-D segmentation branch for SD-OCT. We developed fusion attention modules to connect the 2-D and 3-D branches for effective use of the spatial correspondence between the imaging modalities by sharing learned features. We also used self-supervised pretraining and ensembling to further enhance the performance of the algorithm without the need for additional datasets. We compared the proposed PCV-Net to several alternative model variants.
Main Outcome Measures
The PCV-Net was evaluated based on the Dice similarity coefficient (DSC) of the segmentations and the Pearson’s correlation and absolute difference of the clinical measurements obtained from the segmentations. Manual grading was used as the gold standard.
Results
The PCV-Net showed good performance compared to manual grading and alternative model variants based on both quantitative and qualitative analyses. Compared to the baseline variant, PCV-Net improved the DSC by 0.04 to 0.43 across the different biomarkers, increased the correlations, and decreased the absolute differences of clinical measurements of interest. Specifically, the largest average (mean ± standard error) DSC improvement was for intraretinal fluid, from 0.02 ± 0.00 (baseline variant) to 0.45 ± 0.06 (PCV-Net). In general, improving trends were observed across the model variants as more technical specifications were added, demonstrating the importance of each aspect of the proposed method.
Conclusion
The PCV-Net has the potential to aid clinicians in disease assessment and research to improve clinical understanding and management of PCV.
Financial Disclosure(s)
Proprietary or commercial disclosure may be found after the references.
Indocyanine green angiography (ICGA) provides 2-dimensional (2-D) en face visualization of the choroidal vasculature and is the gold standard imaging method to diagnose PCV. The primary biomarkers visible on ICGA are polypoidal lesions and a branching vascular network (BVN).
OCT, through the composition of multiple cross-sectional B-scans, provides 3-dimensional (3-D) volumetric visualization of the retina and choroid on which corresponding and complementary biomarkers are visible. Specifically, polypoidal lesions on ICGA are associated with subretinal pigment epithelium (RPE) ring-like lesions on OCT, whereas the BVN on ICGA is associated with shallow irregular RPE elevation (SIRE) on OCT. Other biomarkers on OCT that are also useful for disease assessment include the retinal and choroidal thicknesses, intraretinal fluid (IRF), subretinal fluid (SRF), and pigment epithelium detachment (PED).
Polypoidal choroidal vasculopathy: consensus nomenclature and non–indocyanine green angiograph diagnostic criteria from the Asia-Pacific Ocular Imaging Society PCV Workgroup.
In addition to diagnosis, accurate assessment of biomarkers is important for the treatment of PCV. Similar to the case of typical neovascular AMD, treatment decisions in PCV are based on the presence or absence of activity biomarkers.
The use of vascular endothelial growth factor inhibitors and complementary treatment options in polypoidal choroidal vasculopathy: a subtype of neovascular age-related macular degeneration.
Retinal fluid is 1 of the key disease activity criteria and the different impact of retinal fluid compartments has been increasingly appreciated in both typical neovascular AMD and PCV.
Efficacy and safety of intravitreal aflibercept treat-and-extend regimens in the ALTAIR study: 96-week outcomes in the polypoidal choroidal vasculopathy subgroup.
Efficacy of a novel personalised aflibercept monotherapy regimen based on polypoidal lesion closure in participants with polypoidal choroidal vasculopathy.
In PCV, however, the persistence of polypoidal lesions and associated disease activity despite anti-VEGF therapy may call for a change in anti-VEGF agent or the addition of photodynamic therapy.
Efficacy and safety of ranibizumab with or without verteporfin photodynamic therapy for polypoidal choroidal vasculopathy: a randomized clinical trial.
The associated disease activity is often assessed based on the presence of IRF or SRF on OCT, and the treatment spot location and size for photodynamic therapy is guided by the area of polypoidal lesions and BVN on ICGA, although OCT-guided photodynamic therapy has recently been considered as well.
To date, the standard methods in clinical practice to assess PCV are based on manual image analysis and the binary classification of the absence or presence of the biomarkers.
EVEREST study: efficacy and safety of verteporfin photodynamic therapy in combination with ranibizumab or alone versus ranibizumab monotherapy in patients with symptomatic macular polypoidal choroidal vasculopathy.
While detailed quantified measurements such as area or volume of biomarkers are expected to provide additional clinically useful information, the manual segmentation required for such precise quantification is not efficient in clinical practice.
However, existing automatic algorithms segment only a single biomarker and operate on a single imaging modality, such as polypoidal lesions on ICGA or PED on OCT. On the other hand, a few semiautomatic algorithms that operate on multiple imaging modalities have been proposed for color fundus photography and OCT to classify subtypes of AMD.
However, manual input in the form of manual selection or annotation of the OCT B-scans is still required, and these algorithms cannot provide quantified measurements of the respective biomarkers.
One of the main limitations of the existing algorithms is that the spatial correspondence between the features visible in the different imaging modalities is not effectively used. One challenge in exploiting the spatial correspondence between the imaging modalities is the different dimensionalities of the images, since ICGA and color fundus photography are 2-D, whereas OCT is 3-D. Existing algorithms circumvent this challenge in 1 of 2 ways. First, the algorithms operate only on 2-D images by selecting 1 B-scan per OCT volume at the expense of losing the information in all other B-scans.
In this article, we propose a deep learning-based hybrid algorithm for automatic joint multimodal segmentation of multiple biomarkers of PCV on ICGA and OCT images. We call this hybrid algorithm PCV-Net. The PCV-Net was developed and evaluated on images from 2 PCV clinical studies using manual segmentations as the gold standard. We developed fusion attention modules to effectively use the spatial correspondence between 2-D ICGA and 3-D OCT images without discarding information and to share learned features between the imaging modalities, resulting in improved overall performance. The algorithm can also provide automatic quantified measurements of PCV biomarkers. The PCV-Net has the potential to aid research progress in the field, such as in clinical studies investigating the clinical significance of biomarkers to improve diagnosis and treatment, ultimately improving patient outcomes.
Methods
Dataset
The dataset consisted of images from the eyes of participants with PCV enrolled in 2 clinical studies (Phenotyping Asian AMD Study
Efficacy of a novel personalised aflibercept monotherapy regimen based on polypoidal lesion closure in participants with polypoidal choroidal vasculopathy.
) at the Singapore National Eye Center. The studies were approved by the institutional ethics board of SingHealth and adhered to the tenets of the Declaration of Helsinki. This was a retrospective study using de-identified subject details. Participants were imaged with ICGA and spectral domain (SD)-OCT on Spectralis systems (Heidelberg Engineering, GmbH) according to a previously reported standardized protocol.
The SD-OCT raster scans with enhanced depth imaging were acquired on a 30° × 20° (9 × 6 mm) macular region centered on the fovea, in the high-speed mode, with 25 B-scans per volume scan. Each B-scan was averaged using 9 frames in the Automatic Real Time Mean mode.
The Spectralis system also simultaneously acquires a near infrared (NIR) fundus image during SD-OCT imaging. Both ICGA and NIR provide 2-D en face images, whereas SD-OCT provides 3-D volumetric images.
Reading Center and Clinician Graders
All manual image analysis was performed by clinician graders (C.H.V., J.N.M.J-Y., and A.B.J.) from the Singapore National Eye Center Ocular Reading Center who had undergone modality-specific training for the assessment of AMD and PCV.
Spatial Registration
Spatial registration was performed by a clinician grader (C.H.V.) using custom software developed to register the ICGA image to the NIR image acquired during SD-OCT imaging.
Since the NIR and SD-OCT images were acquired simultaneously, this method effectively registered the ICGA to SD-OCT as well. Pairs of corresponding points were identified and selected at various locations on the ICGA and NIR images, usually at vessel intersections or bifurcations and over as wide an area as possible. The software estimated the geometric transformation by mapping the pairs of corresponding points between the images. The geometric transformation was parameterized by 8 unknowns and therefore a minimum of 4 corresponding points was required to estimate the geometric transformation. However, in most cases, > 4 corresponding points were identified and selected to improve the estimate. The accuracy of the estimated geometric transformation was determined by the vessel overlap between the images. Figure 1 shows the spatial registration process.
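The 8-unknown geometric transformation estimated from at least 4 corresponding point pairs is a projective homography, which can be sketched with a standard direct linear transform. This is a minimal NumPy illustration under our own assumptions; the custom registration software's actual implementation is not described in detail, and the function names here are hypothetical.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate the 8-parameter projective transform (homography) mapping
    src points to dst points via the direct linear transform (DLT).
    src, dst: (N, 2) arrays of corresponding points, N >= 4."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes 2 linear equations in the 9 entries of H.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    A = np.asarray(A, dtype=float)
    # The homography is the null vector of A (smallest right singular vector).
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the scale so only 8 unknowns remain

def apply_homography(H, pts):
    """Map (N, 2) points through H using homogeneous coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

With exact correspondences, 4 point pairs suffice, but additional pairs (as used in practice) make the least-squares estimate more robust to point-selection error.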
Figure 1. Illustration of the spatial registration process. Pairs of corresponding points were identified and selected on the near infrared (NIR) image simultaneously acquired during spectral domain (SD)-OCT imaging and the indocyanine green angiography (ICGA) image. The accuracy of the registration was determined by the vessel overlap between the images. The red colored vessels correspond to the inner edges of the blood vessels on the NIR image while the green colored vessels correspond to outer edges of the blood vessels on the ICGA image. The difference between the type of edges segmented on the blood vessels is due to the type of contrast of the blood vessels in the images, i.e., blood vessels that appear darker than the background are segmented as a single inner edge, whereas blood vessels that appear brighter than the background are segmented as double outer edges. The yellow lines show the corresponding positions of the SD-OCT B-scans on the NIR and ICGA images.
Manual Segmentation
Manual segmentation was performed by a clinician grader (C.H.V.) using a custom version of Duke OCT Retinal Analysis Program (DOCTRAP, Version 65.4.8).
The ICGA and SD-OCT images were displayed side-by-side to enable the clinician to easily refer to both imaging modalities during the segmentation process. The biomarkers segmented on ICGA were polypoidal lesions and BVN. The biomarkers segmented on SD-OCT were the retinal and choroidal layer boundaries, IRF, SRF, and sub-RPE ring-like lesions. Five retinal and choroidal layer boundaries were segmented, the inner limiting membrane (ILM), ellipsoid zone (EZ), RPE, Bruch’s membrane (BM), and choroidal-scleral interface (CSI). To better highlight SIRE and PED, the RPE was segmented only in areas where it was separated from the BM. Otherwise, the RPE shared the same segmentations as the BM to indicate normal RPE without separation. The location of the fovea was also annotated on SD-OCT. Figure 2 shows an example of the manual segmentations in DOCTRAP and conversion to segmentation labels.
Figure 2. Manual segmentation of biomarkers was performed using the Duke OCT Retinal Analysis Program (DOCTRAP) and converted to segmentation labels. BM = Bruch’s membrane; BVN = branching vascular network; CSI = choroidal-scleral interface; EZ = ellipsoid zone; ICGA = indocyanine green angiography; ILM = inner limiting membrane; IRF = intraretinal fluid; RPE = retinal pigment epithelium; SD = spectral domain; SRF = subretinal fluid.
Ten-fold cross-validation was used to train and test the automatic segmentation algorithm on all available data to avoid selection bias and ensure independence between the training and testing sets. Participants were randomly divided into 10 groups of approximately equal size. Nine groups were designated as the training set, while the remaining group was designated as the testing set. The groups were then rotated such that each group was used once for testing. For validation, 1 group from the training set was designated as the validation set. As a result, there were 10 models, each trained using a different set of groups. During testing, for a given participant, the model that did not include the participant in its training set was used to generate the automatic segmentations. The same dataset splits were used for both self-supervised pretraining and training, which are described later.
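The group rotation described above can be sketched in plain Python. This is an illustrative sketch under our own assumptions (the authors' actual tooling is not described); the function and field names are hypothetical, and the choice of which training group serves as validation is arbitrary here.

```python
import random

def tenfold_splits(participant_ids, n_folds=10, seed=0):
    """Randomly divide participants into n_folds groups of roughly equal
    size, then rotate so that each group serves exactly once as the testing
    set; one of the remaining training groups is held out for validation."""
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)
    groups = [ids[i::n_folds] for i in range(n_folds)]  # near-equal sizes
    splits = []
    for k in range(n_folds):
        val_idx = (k + 1) % n_folds  # arbitrary training group for validation
        test = groups[k]
        val = groups[val_idx]
        train = [p for i, g in enumerate(groups)
                 if i not in (k, val_idx) for p in g]
        splits.append({"train": train, "val": val, "test": test})
    return splits
```

During testing, a given participant would be segmented by the one model whose training set excluded that participant, which the rotation above guarantees.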
Hybrid Network Architecture With Fusion Attention Modules
The PCV-Net consisted of a hybrid network architecture comprised of a 2-D ICGA segmentation branch and 3-D SD-OCT segmentation branch connected via fusion attention modules. Figure 3 shows the details of the hybrid network architecture.
Figure 3. The hybrid network architecture of polypoidal choroidal vasculopathy-Net consisted of a 2-dimensional (2-D) indocyanine green angiography (ICGA) segmentation branch and 3-dimensional (3-D) spectral domain (SD)-OCT segmentation branch. Fusion attention modules connected the branches and enabled the sharing of features between the 2 branches for effective use of the spatial correspondence between the imaging modalities. The dimensionality of each block indicates the dimensionality of the output for that block, i.e., 2-D or 3-D. ReLU = rectified linear unit.
U-Net: convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer International Publishing, Basel, Switzerland; 2015: 234-241.
3D U-Net: learning dense volumetric segmentation from sparse annotation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer International Publishing, Basel, Switzerland; 2016: 424-432.
Each segmentation branch consisted of 6 encoder blocks and 5 decoder blocks. Each encoder block comprised a max-pooling layer, a convolution layer, and a batch normalization layer followed by rectified linear unit activation, except for the first encoder block, which omitted the max-pooling layer. Each decoder block comprised a transposed convolution layer, followed by concatenation with the features from the corresponding encoder block of the same branch, a convolution layer, and a batch normalization layer followed by rectified linear unit activation. Finally, the output block consisted of a convolution layer followed by a softmax activation. The ICGA segmentation branch used 2-D operations, whereas the SD-OCT segmentation branch used 3-D operations.
Fusion attention modules were designed to connect the 2 segmentation branches into a single end-to-end hybrid network. The fusion attention modules were added to the end of each encoder block to enable the transformation and sharing of learned features between branches for effective use of the spatial correspondence between the imaging modalities. The fusion attention modules consisted of an attention mechanism
to selectively gate the important features to be shared with the other branch. To match the dimensionality of the features, dimensionality reduction (from 3-D to 2-D) or expansion (from 2-D to 3-D) was performed accordingly, followed by nearest neighbor interpolation. Then, the transformed features were concatenated with the features from the corresponding encoder block of the other branch. Dimensionality reduction was performed by taking the mean of values across the dimension, whereas dimensionality expansion was performed by tiling the values across the dimension. The ICGA fusion attention modules used 2-D operations, whereas the SD-OCT fusion attention modules used 3-D operations.
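The dimensionality-matching steps described above (reduction by averaging, expansion by tiling, and nearest neighbor interpolation) can be sketched in NumPy. This is an illustrative sketch of the feature transformations only, not the PCV-Net implementation; the function names and axis conventions (depth as the first axis) are our own assumptions.

```python
import numpy as np

def reduce_3d_to_2d(feat3d, axis=0):
    """Dimensionality reduction: collapse the depth (B-scan) dimension of a
    3-D feature map by taking the mean across that dimension."""
    return feat3d.mean(axis=axis)

def expand_2d_to_3d(feat2d, depth, axis=0):
    """Dimensionality expansion: replicate (tile) a 2-D feature map across
    a new depth dimension."""
    return np.repeat(np.expand_dims(feat2d, axis=axis), depth, axis=axis)

def nearest_resize_2d(feat2d, out_shape):
    """Nearest neighbor interpolation to match the spatial size of the
    corresponding encoder block in the other branch."""
    h, w = feat2d.shape
    rows = np.arange(out_shape[0]) * h // out_shape[0]
    cols = np.arange(out_shape[1]) * w // out_shape[1]
    return feat2d[np.ix_(rows, cols)]
```

After these transformations, the features would be concatenated channel-wise with the other branch's encoder features, as described above.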
Self-supervised Pretraining
Due to the relatively large size of the hybrid network architecture and relatively small dataset, we first pretrained the network using a self-supervised approach based on image reconstruction.
Self-supervised pretraining enabled the network to learn about the structural content of the images and the pretrained weights provided a better initialization of the network during training of the segmentation task. While other supervised pretraining approaches were available, the advantage of self-supervised pretraining was that no external data or additional manual annotations were required.
For self-supervised pretraining, the output blocks of the hybrid network architecture were replaced with reconstruction blocks. Each reconstruction block consisted of a single convolution layer with 1 output channel and no activation. Briefly, image patches of different sizes were randomly removed from the images by setting the pixel intensities within the patches to 0. The network was trained to reconstruct the information in the missing patches based on the surrounding contextual information. For the ICGA images, 10 patches with a maximum size of 100 × 100 pixels were randomly removed whereas for the SD-OCT images, 50 patches with a maximum size of 100 × 100 × 10 pixels were randomly removed. Two forms of image augmentation were also randomly applied from the following intensity augmentations, adding a random scalar, multiplying by a random scalar, adding Gaussian noise, contrast normalization, applying Gaussian blur, or no augmentation. The weights of the network were randomly initialized using He initialization
The network was trained to minimize a weighted L2 loss for 1000 epochs with a batch size of 1, a learning rate of 10^−5, momentum of 0.9, and L2 weight regularization applied with a factor of 1.0.
Class weights were applied to the loss. For the ICGA reconstruction branch, class weights of 5.0, 10.0, and 1.0 were applied to the polypoidal lesions, BVN, and background, respectively. For the SD-OCT reconstruction branch, class weights of 5.0, 10.0, 10.0, 5.0, 15.0, 10.0, 15.0, and 1.0 were applied to the ILM–EZ, EZ–RPE, RPE–BM, BM–CSI, IRF, SRF, sub-RPE ring-like lesions, and background, respectively. Additionally, a weight of 5.0 was added to pixels belonging to the missing patches. The number of epochs, learning rates, regularization factors, and class weights were empirically determined during initial experiments on the validation set.
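The random patch removal used for the reconstruction pretext task can be sketched as follows. This is a minimal NumPy sketch under our own assumptions; the function name is hypothetical, and only the patch counts and maximum sizes stated above are taken from the text.

```python
import numpy as np

def mask_random_patches(image, n_patches, max_size, rng):
    """Zero out n_patches randomly placed patches (each up to max_size per
    dimension) and return the masked image plus a boolean mask of the
    removed pixels; the network is trained to reconstruct those pixels."""
    masked = image.copy()
    removed = np.zeros(image.shape, dtype=bool)
    for _ in range(n_patches):
        # Random patch size per dimension, from 1 up to the maximum.
        size = [int(rng.integers(1, s + 1)) for s in max_size]
        # Random top-left corner such that the patch fits in the image.
        start = [int(rng.integers(0, d - s + 1))
                 for d, s in zip(image.shape, size)]
        sl = tuple(slice(st, st + s) for st, s in zip(start, size))
        masked[sl] = 0  # set pixel intensities within the patch to 0
        removed[sl] = True
    return masked, removed
```

For ICGA this would be called with 10 patches of maximum size 100 × 100 pixels, and for SD-OCT with 50 patches of maximum size 100 × 100 × 10 pixels; the `removed` mask could then carry the extra loss weight of 5.0 described above.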
Training
For training, the network was trained to segment the biomarkers in the images based on the manual segmentations of the training set. Image augmentation was performed as described in self-supervised pretraining. The weights of the network were initialized with the pretrained weights from self-supervised pretraining except for the output blocks, which were randomly initialized using He initialization.
The network was trained for 500 epochs with a batch size of 1, learning rate of 0.01, momentum of 0.9, and L2 weight regularization applied with a factor of 0.0001.
Class weights were applied to the loss as described in self-supervised pretraining. The number of epochs, learning rates, regularization factors, and class weights were empirically determined during initial experiments on the validation set.
Testing
For testing, the trained network was used to segment the biomarkers in the images of the testing set. No image augmentation was performed. The segmentations were enhanced in postprocessing using the following morphological operations.
For each class, a morphological closing operation with a circular or elliptical structuring element and binary filling of holes were applied. For ICGA, circular structuring elements with radius of 3 and 5 pixels were used for polypoidal lesions and BVN, respectively. For SD-OCT, elliptical structuring elements with sizes of 5 × 3, 3 × 1, 3 × 1, 5 × 3, 5 × 1, 5 × 1, and 1 × 5 pixels were used for the ILM–EZ, EZ–RPE, RPE–BM, BM–CSI, IRF, SRF, and sub-RPE ring-like lesions, respectively. For nonlayer biomarkers, regions smaller than 10 pixels were also removed. Additionally, to ensure that the segmentations adhered to the expected pathology and anatomy, several constraints were applied during postprocessing. Any instances of sub-RPE ring-like lesions not within the RPE–BM layer were deleted as sub-RPE ring-like lesions should only exist within the RPE–BM layer. Any instances of IRF or SRF within the RPE–CSI layers were deleted as IRF and SRF are less likely to exist within the RPE–CSI layers.
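The per-class morphological cleanup can be sketched with SciPy's ndimage module, assuming SciPy is available; the helper names are our own, and the structuring-element sizes listed above would be passed in per class.

```python
import numpy as np
from scipy import ndimage

def circular_structure(radius):
    """Circular structuring element, e.g., radius 3 or 5 px for the ICGA
    biomarkers (polypoidal lesions and BVN)."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x**2 + y**2 <= radius**2

def postprocess_mask(mask, structure, min_region_px=10):
    """Morphological closing, binary hole filling, and removal of regions
    smaller than min_region_px, as applied to each nonlayer biomarker."""
    out = ndimage.binary_closing(mask, structure=structure)
    out = ndimage.binary_fill_holes(out)
    labels, n = ndimage.label(out)
    for i in range(1, n + 1):
        if (labels == i).sum() < min_region_px:
            out[labels == i] = False
    return out
```

The anatomical constraints (e.g., deleting sub-RPE ring-like lesions outside the RPE–BM layer) would then be applied as simple mask intersections between the class outputs.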
Ensembling
To improve segmentation accuracy, an ensembling approach was used during testing. Briefly, 3 rounds of self-supervised pretraining and training were performed to produce 3 trained networks. The segmentations from the 3 trained networks were ensembled based on majority voting and postprocessed as described above.
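The majority-voting step can be sketched in NumPy. This is an illustrative sketch with a hypothetical function name; the tie-breaking rule (smallest label wins) is our own assumption, as the text does not specify one.

```python
import numpy as np

def majority_vote(segmentations):
    """Ensemble per-pixel label maps from several trained networks by
    majority voting. Ties resolve in favor of the smallest label index."""
    stack = np.stack(segmentations)  # shape: (n_models, ...) integer labels
    n_classes = int(stack.max()) + 1
    # Count votes per class at every pixel, then take the winning class.
    votes = np.stack([(stack == c).sum(axis=0) for c in range(n_classes)])
    return votes.argmax(axis=0)
```

With 3 networks, a label is kept wherever at least 2 of the 3 networks agree, after which the postprocessing described above would be applied to the fused result.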
Comparison to Alternative Model Variants
The PCV-Net was compared to several alternative model variants to determine the effect of each aspect of the proposed method. First, as the baseline variant, we trained the ICGA and SD-OCT segmentation branches separately without the fusion attention modules. Therefore, this baseline variant was analogous to using a 2-D U-Net and 3-D U-Net to operate separately on the ICGA and SD-OCT images without using the spatial correspondence information between the imaging modalities, as in existing algorithms. Second, we trained the hybrid network with fusion attention modules using random initialization of all the weights of the network, instead of using the pretrained weights from self-supervised pretraining. Third, we trained the hybrid network with fusion attention modules and self-supervised pretraining as described above. For all 3 variants, we compared single and ensemble networks. Table 1 shows the technical specifications of the model variants. The model number indicates the variant described above and the letter S or E indicates the single or ensemble network, respectively.
Table 1. Technical Specifications of the Proposed Method and Alternative Model Variants

Version                      | Baseline  | Intermediate | Proposed PCV-Net
Model                        | 1S  | 1E  | 2S  | 2E     | 3S  | 3E
Baseline                     | ✓   | ✓   | ✓   | ✓      | ✓   | ✓
Fusion attention modules     |     |     | ✓   | ✓      | ✓   | ✓
Self-supervised pretraining  |     |     |     |        | ✓   | ✓
Ensembling                   |     | ✓   |     | ✓      |     | ✓

E = ensemble; PCV = polypoidal choroidal vasculopathy; S = single.
Several clinical measurements of interest were obtained from the segmentations as defined in Table 2. All measurements were calculated within a 6 mm diameter concentric circle centered on the foveal center point as defined by the ETDRS.
Early Treatment Diabetic Retinopathy Study Research Group. Grading diabetic retinopathy from stereoscopic color fundus photographs—an extension of the modified Airlie House classification: ETDRS report number 10.
Clinicians have qualitatively noted that SIRE, defined as elevated or undulating RPE observed on SD-OCT, has a strong correspondence to the BVN observed on ICGA.
However, a robust quantitative definition of SIRE has yet to be established. Furthermore, the feature defined as a separation of RPE and BM may also refer to PED.
These differences between SIRE and PED are intuitively obvious to clinicians but require strict quantitative definitions for automatic segmentation algorithms. While most clinical studies focus only on the maximum height of the PED,
The role of pigment epithelial detachment in AMD with submacular hemorrhage treated with vitrectomy and subretinal co-application of rtPA and anti-VEGF.
a quantitative definition of the minimum elevation required to be considered a PED is also important for other quantified measurements such as volume.
Therefore, we developed quantitative definitions of SIRE and PED by analyzing the correspondence between the manual segmentations of the BVN on ICGA and the RPE–BM layer on SD-OCT. Two criteria are necessary to define and distinguish SIRE and PED: (1) the RPE–BM layer thickness (a quantitative measure of elevation) and (2) the RPE layer roughness (a quantitative measure of the undulating surface). The RPE–BM layer thickness was calculated as the height difference between the segmentations of the RPE and BM layer boundaries. The RPE layer roughness at a given point was calculated as follows:
Roughness = σ(Δs), where s was the segmentation of the RPE layer boundary within a window of 400 μm length centered at the given point as shown in Figure 4, σ(·) was the standard deviation operation, and Δ(·) was the first difference operation. Therefore, a smooth surface would have a roughness close to zero, while a more undulating surface would have a higher roughness.
Figure 4. Retinal pigment epithelium layer roughness at a given point (yellow dot) was calculated within a window of 400 μm length (yellow box).
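The roughness measure (standard deviation of the first difference of the RPE boundary heights within a window centered at the point of interest) can be sketched in NumPy. This is an illustrative sketch; the function name is hypothetical, the window is given in pixels (the 400 μm length would first be converted using the scan's lateral resolution), and boundary handling at the image edges is our own simplification.

```python
import numpy as np

def rpe_roughness(boundary, center, window_px):
    """Roughness at a given A-scan index: std of the first difference of
    the RPE boundary heights within a window centered at that index."""
    half = window_px // 2
    lo, hi = max(0, center - half), min(len(boundary), center + half + 1)
    s = boundary[lo:hi]          # boundary heights within the window
    return float(np.std(np.diff(s)))
```

Note that a flat or uniformly tilted boundary both give a roughness of (approximately) zero, so the measure captures undulation rather than elevation, which is instead captured by the thickness criterion.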
Grid search was used to determine the optimum thresholds for thickness, roughness, and window length by calculating the overlap between the BVN on ICGA and SIRE (excluding PED) on SD-OCT based on the Dice similarity coefficient (DSC). More details of the DSC are available in Evaluation Metrics. Thickness and roughness thresholds were determined separately and then combined to obtain the SIRE on SD-OCT. Binary morphological operations
to fill holes and retain only the largest region were applied to remove small discontinuous regions of SIRE on SD-OCT and improve the correspondence analysis with the BVN on ICGA, which was typically segmented as a single continuous region. The values used for the grid search were within 0 to 30 μm for minimum thickness of SIRE, 100 to 200 μm for minimum thickness of PED, 1 to 3 μm for the minimum roughness of SIRE, and 300 to 500 μm for window length. Overall, the highest average (mean ± standard deviation, median) DSC achieved across all 72 participants was 0.61 ± 0.22, 0.68 at the thresholds given in Table 2, which were selected as the final thresholds.
Evaluation Metrics
Several evaluation metrics were calculated to evaluate the performance of PCV-Net. The DSC
was calculated to measure the overlap between the manual and automatic segmentations. The range of the DSC is from 0 to 1, whereby a higher value indicates better performance. The Pearson’s correlation, r, and the absolute difference between the manual and automatic clinical measurements obtained from the segmentations were also calculated. The range of r is from −1 to 1, whereby a value closer to 1 indicates better performance. One popular convention is that r ≥ 0.70 indicates high correlation, 0.50 ≤ r < 0.70 indicates moderate correlation, and r < 0.50 indicates low correlation.
On the other hand, a lower value indicates better performance for the absolute difference. The DSC and absolute difference were calculated for each eye and averaged across all eyes. All eyes were included in the evaluation metrics, even if the specific biomarker was absent in a particular eye, as it is important that an algorithm’s ability to correctly identify the absence of a biomarker is reflected in the evaluation metrics as well. Therefore, if an absent biomarker was correctly identified as absent by the algorithm, this resulted in a DSC of 1. On the other hand, if an absent biomarker was incorrectly identified as present by the algorithm, this resulted in a DSC of 0. Each evaluation metric measures a different aspect of performance, and the metrics are complementary to each other. Therefore, it is important that they are considered together when evaluating the overall performance of an algorithm.
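The DSC, including the convention above for absent biomarkers, can be sketched in NumPy; the function name is our own, and only the edge-case handling stated above is taken from the text.

```python
import numpy as np

def dice(manual, auto):
    """Dice similarity coefficient between two binary masks, with the
    convention used here for absent biomarkers: DSC = 1 if both masks are
    empty (absence correctly identified), DSC = 0 if only one is empty."""
    manual = np.asarray(manual, dtype=bool)
    auto = np.asarray(auto, dtype=bool)
    m, a = manual.sum(), auto.sum()
    if m == 0 and a == 0:
        return 1.0  # absent biomarker correctly identified as absent
    if m == 0 or a == 0:
        return 0.0  # absent biomarker incorrectly identified as present (or missed)
    return 2.0 * np.logical_and(manual, auto).sum() / (m + a)
```

These per-eye values would then be averaged across all eyes, as described above.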
Implementation
The custom software for spatial registration and manual segmentation (DOCTRAP, Version 65.4.8) was developed in MATLAB (Version 9.5.0 R2018b).
The dataset consisted of 72 pairs of ICGA and SD-OCT images from 72 eyes of 72 participants at the baseline timepoint. All ICGA images and SD-OCT B-scans were resized using bilinear interpolation to a standard size of 512 × 512 pixels.
Quantitative Analysis
Tables 3, 4, and 5 show the average evaluation metrics of the proposed method and alternative model variants on all 72 participants. Table 3 shows the DSC. Tables 4 and 5 show the Pearson’s correlation and the absolute difference between the manual and automatic clinical measurements of interest obtained from the segmentations, respectively. For ease of visual comparison, Figures 5 and 6 show the radar charts of the DSC and correlation, respectively, whereby a larger area of the radar chart indicates better performance. Note that a radar chart of the absolute difference would not be as useful due to the different orders of magnitude of the clinical measurements. Overall, the proposed PCV-Net outperformed the baseline variant on all evaluation metrics with higher DSCs, higher correlations, and lower absolute differences. In Table 3, the proposed PCV-Net had the highest DSCs for 6 of the 9 biomarkers, specifically the BVN, ILM–EZ, RPE–BM, BM–CSI, IRF, and SRF, and was at least the next best-performing variant for the other biomarkers. Similar results can be observed in Tables 4 and 5, whereby the proposed PCV-Net had the best performance for many clinical measurements of interest and was at least the next best-performing variant in most of the other cases. In general, improving trends were observed across the model variants as more technical specifications were added, demonstrating the importance of each aspect of the proposed method. Scatterplots of the manual and automatic clinical measurements of interest are available in Online Supplementary Information 1 and an inter-reader analysis is available in Online Supplementary Information 2.
Table 3Average (Mean ± Standard Error, Median) DSC of the Proposed Method and Alternative Model Variants on All 72 Participants
Table 4Pearson’s Correlation, r, of the Manual and Automatic Clinical Measurements of Interest by the Proposed Method and Alternative Model Variants on all 72 Participants
Table 5. Average (Mean ± Standard Error, Median) Absolute Difference of the Manual and Automatic Clinical Measurements of Interest by the Proposed Method and Alternative Model Variants on All 72 Participants
Figure 5. Radar chart of the mean Dice similarity coefficient of the proposed method and alternative model variants on all 72 participants. BM = Bruch’s membrane; BVN = branching vascular network; CSI = choroidal-scleral interface; E = ensemble; EZ = ellipsoid zone; ILM = inner limiting membrane; IRF = intraretinal fluid; RPE = retinal pigment epithelium; S = single; SD = spectral domain; SRF = subretinal fluid.
Figure 6. Radar chart of the Pearson’s correlation, r, of the manual and automatic clinical measurements of interest by the proposed method and alternative model variants on all 72 participants. BVN = branching vascular network; E = ensemble; EZ = ellipsoid zone; IRF = intraretinal fluid; PED = pigment epithelium detachment; RPE = retinal pigment epithelium; S = single; SD = spectral domain; SIRE = shallow irregular RPE elevation; SRF = subretinal fluid.
Figures 7 and 8 show examples of the manual segmentations by the 2 independent clinicians and the automatic segmentations by the proposed method and alternative model variants on ICGA and SD-OCT images of PCV. More examples are available in Online Supplementary Information 3. Overall, the proposed PCV-Net produced the best automatic segmentations among the model variants, with the highest similarity to the manual segmentations by the clinicians. While the alternative variants showed reasonable performance on images with minor pathology, PCV-Net outperformed them on difficult images with severe pathology. In several challenging cases, there was poor agreement even between the manual segmentations by the clinicians, highlighting the complexity and ambiguity of the pathology.
Figure 7. Example segmentations by the proposed method and alternative model variants on indocyanine green angiography (ICGA). The spectral domain-OCT B-scans correspond to the positions marked by the horizontal lines on ICGA. In the first example (yellow), the algorithms incorrectly segmented a region of the branching vascular network (BVN). Analysis of the corresponding B-scan showed that there was subtle shallow irregular retinal pigment epithelium (RPE) elevation in the region, but it was likely not considered severe enough for the clinicians to associate with the BVN. In the second example (orange), the clinicians and algorithms, except 1S, showed good agreement in their segmentations of the polypoidal lesion, which was clearly visible on ICGA. Analysis of the corresponding B-scan showed that a sub-RPE ring-like lesion was clearly visible (white arrow) as well. Overall, the clinicians had good agreement. Compared to the clinicians’ segmentations, the proposed method had the best agreement for polypoidal lesions and a slightly larger BVN area. E = ensemble; S = single.
Figure 8. Example segmentations by the proposed method and alternative model variants on spectral domain (SD)-OCT. The SD-OCT B-scan corresponds to the position marked by the horizontal line on indocyanine green angiography (ICGA). 1S and 1E did not segment the sub-retinal pigment epithelium (RPE) ring-like lesion, which was only faintly visible (white arrow) on the B-scan. Analysis of the corresponding ICGA showed a clearly visible polypoidal lesion, which may explain why the models with fusion attention modules were able to identify the sub-RPE ring-like lesion on the B-scan more accurately. Overall, the clinicians had good agreement, with some differences for ellipsoid zone (EZ)–RPE and the size of the sub-RPE ring-like lesion. Compared to the clinicians’ segmentations, the proposed method had the best agreement, especially for the sub-RPE ring-like lesion. BM = Bruch’s membrane; CSI = choroidal-scleral interface; E = ensemble; ILM = inner limiting membrane; IRF = intraretinal fluid; S = single; SRF = subretinal fluid.
We have proposed and developed PCV-Net, a deep learning-based hybrid algorithm for joint multimodal automatic segmentation of multiple PCV biomarkers on ICGA and SD-OCT images. The PCV-Net consisted of a hybrid network architecture comprising a 2-D ICGA segmentation branch and a 3-D SD-OCT segmentation branch connected via fusion attention modules, which we developed to effectively use the spatial correspondence between the imaging modalities.
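The core idea behind fusing features of different dimensionalities without discarding information can be illustrated with a toy example: attention-weighted pooling of a 3-D feature volume along the depth axis to produce a 2-D map that can be fused element-wise with 2-D features. This is our simplified sketch only; the actual layer shapes, learned parameters, and design of the fusion attention modules are described in the Methods, not here, and the per-slice scores below are given rather than learned:

```python
import math

def softmax(v):
    """Numerically stable softmax over a list of scores."""
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def attention_pool_depth(volume, scores):
    """Collapse a 3-D volume (depth x H x W) into a 2-D map by softmax-weighted
    averaging along depth, rather than simply dropping slices."""
    w = softmax(scores)  # one attention weight per depth slice
    depth, h, wd = len(volume), len(volume[0]), len(volume[0][0])
    return [[sum(w[d] * volume[d][i][j] for d in range(depth))
             for j in range(wd)] for i in range(h)]

def fuse(feat_2d, pooled_3d):
    """Element-wise fusion of 2-D features with the pooled 3-D map."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(feat_2d, pooled_3d)]

# Example: two depth slices with equal attention scores.
vol = [[[0.0, 2.0]], [[4.0, 6.0]]]
pooled = attention_pool_depth(vol, [0.0, 0.0])  # equal weights -> depth-wise mean
fused = fuse([[1.0, 1.0]], pooled)
```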
Overall, the proposed method showed good performance compared to alternative model variants based on both quantitative and qualitative analyses when applied to images from 2 PCV clinical studies. Compared to the baseline variant, which was analogous to using a 2-D U-Net and a 3-D U-Net operating separately on the ICGA and SD-OCT images without using the spatial correspondence between the imaging modalities, the proposed method improved the DSC by 0.04 to 0.43 across the different biomarkers, increased the correlations, and decreased the absolute differences of the clinical measurements of interest. While most of the performance improvement could be attributed to the fusion attention modules, we also showed that self-supervised pretraining and ensembling further boosted the algorithm’s performance. Although the quantitative improvements from self-supervised pretraining and ensembling were smaller, large qualitative improvements were observed in the segmentations. For example, ensembling reduced regions of noisy segmentations, and self-supervised pretraining improved the segmentations of some of the rarer biomarkers such as IRF and sub-RPE ring-like lesions.
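One common ensembling scheme (not necessarily the exact one used in PCV-Net) averages per-pixel probabilities across ensemble members before thresholding, so that an isolated noisy prediction from any single model is averaged away:

```python
def ensemble_segmentation(prob_maps, threshold=0.5):
    """Average per-pixel probabilities from several models, then threshold.

    prob_maps: list of H x W probability maps, one per ensemble member.
    Returns a binary H x W mask.
    """
    n = len(prob_maps)
    h, w = len(prob_maps[0]), len(prob_maps[0][0])
    return [[1 if sum(m[i][j] for m in prob_maps) / n >= threshold else 0
             for j in range(w)] for i in range(h)]

# Example: three models; the third produces a spurious high probability
# at the second pixel, which the ensemble suppresses.
maps = [[[0.9, 0.1]], [[0.8, 0.2]], [[0.7, 0.9]]]
mask = ensemble_segmentation(maps)
```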
The inter-reader analysis showed that the proposed PCV-Net has room for further improvement before it can achieve performance on par with manual segmentations by clinicians. In particular, there was a significant difference in DSC of ≥ 0.20 between manual and automatic segmentations for polypoidal lesions and BVN on ICGA and sub-RPE ring-like lesions on SD-OCT, indicating the most important biomarkers to focus on for further improvement. In addition, the inter-reader analysis highlighted the difficulty of obtaining an accurate gold standard through manual segmentation in such a complex disease. Sub-RPE ring-like lesions, polypoidal lesions, IRF, BVN, and EZ are notoriously difficult to segment even for experienced clinicians, and the independent clinicians (C.H.V. and J.N.M.J-Y./C.H.V. and A.B.J.) only agreed 38% to 65% of the time for these biomarkers. Upon further analysis, some of the disagreements between the clinicians could be attributed to manual segmentation errors by 1 of the clinicians or to differing opinions in ambiguous regions of the images. However, we note that differing opinions or approaches to segmenting these ambiguous regions may not necessarily have a significant impact on clinical outcome, as long as they remain consistent over time, as has been shown in previous work.
As we continue to develop PCV-Net in future work, we will explore several steps to further improve the performance of the algorithm. First, we will focus on the collection of a larger dataset with more accurate manual segmentations to create a high-quality gold standard for training and evaluation. Many deep learning-based algorithms are developed with datasets on the order of thousands or even millions of examples. While it is certainly harder to curate such large datasets in the medical domain, we will consider alternative approaches such as combining or averaging the manual segmentations from multiple clinicians to improve the accuracy and robustness of the manual segmentations. We also aim to evaluate the generalizability of PCV-Net to data from other clinical studies and institutions, as we have done with our previous algorithms,
as the generalizability of an algorithm is an important aspect for clinical application. Additionally, we will conduct intrareader analyses to provide more insight into the performance of individual clinicians in repeated manual segmentations of the same images. Similarly, we will conduct longitudinal intrareader analyses to evaluate the consistency of both manual and automatic segmentations over time, providing a more clinically relevant assessment of performance.
In addition, performance gains may be achieved by increasing the size of the network or using an ensemble of a larger number of networks, albeit at the cost of increased training and testing times. Furthermore, while only simple postprocessing using morphological operations was used in PCV-Net, advanced postprocessing techniques may also improve the quality of the automatic segmentations. We elected not to use such techniques in PCV-Net as our focus was on highlighting the effectiveness of the hybrid network architecture with fusion attention modules.
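The simple morphological postprocessing mentioned above can be sketched as a binary opening (erosion followed by dilation), which removes small spurious foreground specks while preserving larger structures. This is a minimal illustration with a fixed 3 × 3 structuring element, not the exact operations used in PCV-Net:

```python
def erode(mask):
    """Binary erosion: a pixel survives only if its full 3x3 neighborhood
    is foreground (border pixels are treated as background)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            out[i][j] = int(all(mask[i + di][j + dj]
                                for di in (-1, 0, 1) for dj in (-1, 0, 1)))
    return out

def dilate(mask):
    """Binary dilation: a pixel becomes foreground if any pixel in its
    3x3 neighborhood is foreground."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            out[i][j] = int(any(mask[i + di][j + dj]
                                for di in (-1, 0, 1) for dj in (-1, 0, 1)
                                if 0 <= i + di < h and 0 <= j + dj < w))
    return out

def binary_open(mask):
    """Morphological opening: erosion then dilation removes small specks."""
    return dilate(erode(mask))

# Example: a solid 3x3 segmented region plus an isolated 1-pixel speck.
speckled = [
    [0, 0, 0, 0, 1],  # isolated speck at top right
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],  # solid block
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
cleaned = binary_open(speckled)
```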
We also note that the hybrid network architecture with fusion attention modules is not limited to assessing PCV and can easily be adapted for any application that requires the combination of 2-D and 3-D information. While the architectures of the 2-D and 3-D segmentation branches in the hybrid network were designed specifically for ICGA and SD-OCT in our application, the fusion attention modules can easily be used with different branch architectures as well.
In conclusion, we have proposed and developed PCV-Net, a deep learning-based hybrid algorithm for automatic joint multimodal segmentation of multiple biomarkers of PCV on ICGA and SD-OCT images. We have highlighted the importance of effectively using the spatial correspondence between imaging modalities and developed fusion attention modules that share learned features between imaging modalities of different dimensionalities without discarding information, for overall improved performance. The PCV-Net can provide several clinical measurements of interest for important biomarkers, including SIRE and PED, for which we derived quantitative definitions, at a fraction of the time and cost of manual segmentations by clinicians. Therefore, there is great potential for the algorithm to aid clinicians in disease assessment. As the research progresses and the algorithms are further developed, we expect this work to lead to a better understanding of the disease, ultimately improving clinical management and patient outcomes.
Polypoidal choroidal vasculopathy: consensus nomenclature and non–indocyanine green angiograph diagnostic criteria from the Asia-Pacific Ocular Imaging Society PCV Workgroup.
The use of vascular endothelial growth factor inhibitors and complementary treatment options in polypoidal choroidal vasculopathy: a subtype of neovascular age-related macular degeneration.
Efficacy and safety of intravitreal aflibercept treat-and-extend regimens in the ALTAIR study: 96-week outcomes in the polypoidal choroidal vasculopathy subgroup.
Efficacy of a novel personalised aflibercept monotherapy regimen based on polypoidal lesion closure in participants with polypoidal choroidal vasculopathy.
Efficacy and safety of ranibizumab with or without verteporfin photodynamic therapy for polypoidal choroidal vasculopathy: a randomized clinical trial.
EVEREST study: efficacy and safety of verteporfin photodynamic therapy in combination with ranibizumab or alone versus ranibizumab monotherapy in patients with symptomatic macular polypoidal choroidal vasculopathy.
U-net: convolutional networks for biomedical image segmentation.
in: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer International Publishing,
Basel, Switzerland, 2015: 234-241
3D U-Net: learning dense volumetric segmentation from sparse annotation.
in: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer International Publishing,
Basel, Switzerland, 2016: 424-432
Grading diabetic retinopathy from stereoscopic color fundus photographs—an extension of the modified Airlie house classification: ETDRS report number 10.
The role of pigment epithelial detachment in AMD with submacular hemorrhage treated with vitrectomy and subretinal co-application of rtPA and anti-VEGF.
Validation of a deep learning-based algorithm for segmentation of the ellipsoid zone on optical coherence tomography images of an USH2A-related retinal degeneration clinical trial.
National Institutes of Health (P30 EY005722), Research to Prevent Blindness (Unrestricted Grant to Duke University), Duke/Duke-NUS Research Collaboration Pilot Project Award (Duke/Duke-NUS/RECA(Pilot)/2019/0052), Singapore Open Fund (2018 Large Collaborative Grant).
HUMAN SUBJECTS: Human subjects were used in this study. The studies were approved by the institutional ethics board of SingHealth and adhered to the tenets of the Declaration of Helsinki. This is a retrospective study using de-identified subject details.
No animal subjects were used in this study.
Author Contributions:
Conception and design: Loo, Teo, Jaffe, Cheung, Farsiu
Analysis and interpretation: Loo, Teo, Jaffe, Cheung, Farsiu
Data collection: Loo, Teo, Vyas, Jordan-Yu, Juhari, Jaffe, Cheung, Farsiu