Original Article | Volume 3, Issue 3, 100292, September 2023


# Joint Multimodal Deep Learning-based Automatic Segmentation of Indocyanine Green Angiography and OCT Images for Assessment of Polypoidal Choroidal Vasculopathy Biomarkers

Open Access. Published: February 24, 2023

### Purpose

To develop a fully-automatic hybrid algorithm to jointly segment and quantify biomarkers of polypoidal choroidal vasculopathy (PCV) on indocyanine green angiography (ICGA) and spectral domain-OCT (SD-OCT) images.

### Design

Evaluation of diagnostic test or technology.

### Participants

Seventy-two participants with PCV enrolled in clinical studies at Singapore National Eye Center.

### Methods

The dataset consisted of 2-dimensional (2-D) ICGA and 3-dimensional (3-D) SD-OCT images which were spatially registered and manually segmented by clinicians. A deep learning-based hybrid algorithm called PCV-Net was developed for automatic joint segmentation of biomarkers. The PCV-Net consisted of a 2-D segmentation branch for ICGA and 3-D segmentation branch for SD-OCT. We developed fusion attention modules to connect the 2-D and 3-D branches for effective use of the spatial correspondence between the imaging modalities by sharing learned features. We also used self-supervised pretraining and ensembling to further enhance the performance of the algorithm without the need for additional datasets. We compared the proposed PCV-Net to several alternative model variants.

### Main Outcome Measures

The PCV-Net was evaluated based on the Dice similarity coefficient (DSC) of the segmentations and the Pearson’s correlation and absolute difference of the clinical measurements obtained from the segmentations. Manual grading was used as the gold standard.

### Results

The PCV-Net showed good performance compared to manual grading and alternative model variants based on both quantitative and qualitative analyses. Compared to the baseline variant, PCV-Net improved the DSC by 0.04 to 0.43 across the different biomarkers, increased the correlations, and decreased the absolute differences of clinical measurements of interest. Specifically, the largest average (mean ± standard error) DSC improvement was for intraretinal fluid, from 0.02 ± 0.00 (baseline variant) to 0.45 ± 0.06 (PCV-Net). In general, improving trends were observed across the model variants as more technical specifications were added, demonstrating the importance of each aspect of the proposed method.

### Conclusion

The PCV-Net has the potential to aid clinicians in disease assessment and research to improve clinical understanding and management of PCV.

### Financial Disclosure(s)

Proprietary or commercial disclosure may be found after the references.

## Keywords

#### Abbreviations and Acronyms:

2-D (2-dimensional), 3-D (3-dimensional), AMD (age-related macular degeneration), BM (Bruch’s membrane), BVN (branching vascular network), CSI (choroidal-scleral interface), DOCTRAP (Duke Optical Coherence Tomography Retinal Analysis Program), DSC (Dice similarity coefficient), EZ (ellipsoid zone), ICGA (indocyanine green angiography), ILM (inner limiting membrane), IRF (intraretinal fluid), NIR (near infrared), PCV (polypoidal choroidal vasculopathy), PED (pigment epithelium detachment), RPE (retinal pigment epithelium), SD (spectral domain), SIRE (shallow irregular RPE elevation), SRF (subretinal fluid)
Polypoidal choroidal vasculopathy (PCV) is a subtype of neovascular age-related macular degeneration (AMD).
[Ciardella et al., "Polypoidal choroidal vasculopathy"; Cheung et al., "Polypoidal choroidal vasculopathy: definition, pathogenesis, diagnosis, and management"]
Indocyanine green angiography (ICGA) provides 2-dimensional (2-D) en face visualization of the choroidal vasculature and is the gold standard imaging method to diagnose PCV. The primary biomarkers visible on ICGA are polypoidal lesions and a branching vascular network (BVN).
[Cheung et al., "Polypoidal choroidal vasculopathy: definition, pathogenesis, diagnosis, and management"; Spaide et al., "Indocyanine green videoangiography of idiopathic polypoidal choroidal vasculopathy"]
Polypoidal lesions are nodular vascular agglomerations that suggest the appearance of aneurysms.
[Spaide et al., "Consensus nomenclature for reporting neovascular age-related macular degeneration data: consensus on neovascular age-related macular degeneration nomenclature study group"]
OCT, through the composition of multiple cross-sectional B-scans, provides 3-dimensional (3-D) volumetric visualization of the retina and choroid on which corresponding and complementary biomarkers are visible. Specifically, polypoidal lesions on ICGA are associated with subretinal pigment epithelium (RPE) ring-like lesions on OCT, whereas the BVN on ICGA is associated with shallow irregular RPE elevation (SIRE) on OCT. Other biomarkers on OCT that are also useful for disease assessment include the retinal and choroidal thicknesses, intraretinal fluid (IRF), subretinal fluid (SRF), and pigment epithelium detachment (PED).
[Cheung et al., "Polypoidal choroidal vasculopathy: consensus nomenclature and non–indocyanine green angiograph diagnostic criteria from the Asia-Pacific Ocular Imaging Society PCV Workgroup"]
In addition to diagnosis, accurate assessment of biomarkers is important for the treatment of PCV. Similar to the case of typical neovascular AMD, treatment decisions in PCV are based on the presence or absence of activity biomarkers.
[Teo, Gillies, and Fraser-Bell, "The use of vascular endothelial growth factor inhibitors and complementary treatment options in polypoidal choroidal vasculopathy: a subtype of neovascular age-related macular degeneration"; Teo et al., "Non-ICGA treatment criteria for suboptimal anti-VEGF response for polypoidal choroidal vasculopathy: APOIS PCV Workgroup Report 2"]
Retinal fluid is 1 of the key disease activity criteria and the different impact of retinal fluid compartments has been increasingly appreciated in both typical neovascular AMD and PCV.
[Takahashi et al., "Efficacy and safety of intravitreal aflibercept treat-and-extend regimens in the ALTAIR study: 96-week outcomes in the polypoidal choroidal vasculopathy subgroup"; Teo et al., "Efficacy of a novel personalised aflibercept monotherapy regimen based on polypoidal lesion closure in participants with polypoidal choroidal vasculopathy"; Maruko et al., "Two-year outcomes of treat-and-extend intravitreal aflibercept for exudative age-related macular degeneration: a prospective study"]
In PCV, however, the persistence of polypoidal lesions and associated disease activity despite anti-VEGF therapy may call for a change in anti-VEGF agent or the addition of photodynamic therapy.
[Cheung et al., "Polypoidal choroidal vasculopathy: definition, pathogenesis, diagnosis, and management"; Gomi et al., "Initial versus delayed photodynamic therapy in combination with ranibizumab for treatment of polypoidal choroidal vasculopathy: the Fujisan study"; Koh et al., "Efficacy and safety of ranibizumab with or without verteporfin photodynamic therapy for polypoidal choroidal vasculopathy: a randomized clinical trial"]
The associated disease activity is often assessed based on the presence of IRF or SRF on OCT, and the treatment spot location and size for photodynamic therapy is guided by the area of polypoidal lesions and BVN on ICGA, although OCT-guided photodynamic therapy has recently been considered as well.
[Teo et al., "Non-ICGA treatment criteria for suboptimal anti-VEGF response for polypoidal choroidal vasculopathy: APOIS PCV Workgroup Report 2"; Spaide et al., "Treatment of polypoidal choroidal vasculopathy with photodynamic therapy"]
To date, the standard methods in clinical practice to assess PCV are based on manual image analysis and the binary classification of the absence or presence of the biomarkers.
[Cheung et al., "Polypoidal choroidal vasculopathy: consensus nomenclature and non–indocyanine green angiograph diagnostic criteria from the Asia-Pacific Ocular Imaging Society PCV Workgroup"; Teo et al., "Non-ICGA treatment criteria for suboptimal anti-VEGF response for polypoidal choroidal vasculopathy: APOIS PCV Workgroup Report 2"; Gomi et al., "Initial versus delayed photodynamic therapy in combination with ranibizumab for treatment of polypoidal choroidal vasculopathy: the Fujisan study"; Koh et al., "Efficacy and safety of ranibizumab with or without verteporfin photodynamic therapy for polypoidal choroidal vasculopathy: a randomized clinical trial"; Koh et al., "EVEREST study: efficacy and safety of verteporfin photodynamic therapy in combination with ranibizumab or alone versus ranibizumab monotherapy in patients with symptomatic macular polypoidal choroidal vasculopathy"; Lee et al., "Efficacy and safety of intravitreal aflibercept for polypoidal choroidal vasculopathy in the PLANET study: a randomized clinical trial"; Oishi et al., "Comparison of the effect of ranibizumab and verteporfin for polypoidal choroidal vasculopathy: 12-month LAPTOP study results"]
While detailed quantified measurements such as area or volume of biomarkers are expected to provide additional clinically useful information, the manual segmentation required for such precise quantification is not efficient in clinical practice.
[Loo et al., "Open-source automatic biomarker measurement on slit-lamp photography to estimate visual acuity in microbial keratitis"; Kim, Loo, Farsiu, and Jaffe, "Comparison of single drusen size on color fundus photography and spectral-domain optical coherence tomography"; Ferris et al., "Clinical classification of age-related macular degeneration"]
Automatic image analysis algorithms can help fulfill this unmet need and have been developed to help clinicians assess many ophthalmic conditions.
[Schmidt-Erfurth et al., "Artificial intelligence in retina"; Wang et al., "Artificial intelligence and deep learning in ophthalmology"; Rasti et al., "Deep learning-based single-shot prediction of differential effects of anti-VEGF treatment in patients with diabetic macular edema"]
For PCV, automatic image analysis algorithms have been developed to segment certain biomarkers.
[Lin et al., "Automatic segmentation of polypoidal choroidal vasculopathy from indocyanine green angiography using spatial and temporal patterns"; Xu et al., "Dual-stage deep learning framework for pigment epithelium detachment segmentation in polypoidal choroidal vasculopathy"]
However, each of these algorithms segments only a single biomarker on a single imaging modality, such as polypoidal lesions on ICGA or PED on OCT. On the other hand, a few semiautomatic algorithms that operate on multiple imaging modalities have been proposed for color fundus photography and OCT to classify subtypes of AMD.
[Xu et al., "Automated diagnoses of age-related macular degeneration and polypoidal choroidal vasculopathy using bi-modal deep convolutional neural networks"; Chou et al., "Deep learning and ensemble stacking technique for differentiating polypoidal choroidal vasculopathy from neovascular age-related macular degeneration"]
However, manual input in the form of manual selection or manual annotation of the OCT B-scans is still required and the algorithms cannot provide quantified measurements of the respective biomarkers.
One of the main limitations of the existing algorithms is that the spatial correspondence between the features visible in the different imaging modalities is not effectively used. One challenge in exploiting the spatial correspondence between the imaging modalities is the different dimensionalities of the images, since ICGA and color fundus photography are 2-D, whereas OCT is 3-D. Existing algorithms circumvent this challenge in 1 of 2 ways. First, the algorithms operate only on 2-D images by selecting 1 B-scan per OCT volume at the expense of losing the information in all other B-scans.
[Xu et al., "Automated diagnoses of age-related macular degeneration and polypoidal choroidal vasculopathy using bi-modal deep convolutional neural networks"]
Second, the algorithms operate separately on the 2-D and 3-D images at the expense of losing important spatial correspondence information.
[Chou et al., "Deep learning and ensemble stacking technique for differentiating polypoidal choroidal vasculopathy from neovascular age-related macular degeneration"]
In this article, we propose a deep learning-based hybrid algorithm for automatic joint multimodal segmentation of multiple biomarkers of PCV on ICGA and OCT images. We call this hybrid algorithm PCV-Net. The PCV-Net was developed and evaluated on images from 2 PCV clinical studies using manual segmentations as the gold standard. We developed fusion attention modules to effectively use the spatial correspondence between 2-D ICGA and 3-D OCT images without discarding information and to share learned features between the imaging modalities, resulting in improved performance overall. The algorithm can also provide automatic quantified measurements of PCV biomarkers. The PCV-Net has the potential to aid research progress in the field, such as in clinical studies investigating the clinical significance of biomarkers to improve diagnosis and treatment, ultimately improving patient outcomes.

## Methods

### Dataset

The dataset consisted of images from the eyes of participants with PCV enrolled in 2 clinical studies at the Singapore National Eye Center: the Phenotyping Asian AMD Study [Cheung et al., "Asian age-related macular degeneration phenotyping study: rationale, design and protocol of a prospective cohort study"] and An Open-Label Study to Compare the Efficacy of Aflibercept Monotherapy for PCV (NCT03117634) [Teo et al., "Efficacy of a novel personalised aflibercept monotherapy regimen based on polypoidal lesion closure in participants with polypoidal choroidal vasculopathy"]. The studies were approved by the institutional ethics board of SingHealth and adhered to the tenets of the Declaration of Helsinki. This is a retrospective study using de-identified subject details. Participants were imaged with ICGA and spectral domain (SD)-OCT on Spectralis systems (Heidelberg Engineering GmbH) according to a previously reported standardized protocol.
[Cheung et al., "Asian age-related macular degeneration phenotyping study: rationale, design and protocol of a prospective cohort study"]
The SD-OCT raster scans with enhanced depth imaging were acquired on a 30° × 20° (9 × 6 mm) macular region centered on the fovea, in the high-speed mode, with 25 B-scans per volume scan. Each B-scan was averaged using 9 frames in the Automatic Real Time Mean mode.
[Tan et al., "Optical coherence tomography features of polypoidal lesion closure in polypoidal choroidal vasculopathy treated with aflibercept"; Vyas et al., "Novel volumetric imaging biomarkers for assessing disease activity in eyes with PCV"]
The Spectralis system also simultaneously acquires a near infrared (NIR) fundus image during SD-OCT imaging. Both ICGA and NIR provide 2-D en face images, whereas SD-OCT provides 3-D volumetric images.

All manual image analysis was performed by clinician graders (C.H.V., J.N.M.J-Y., and A.B.J.) from the Singapore National Eye Center Ocular Reading Center who had undergone modality-specific training for the assessment of AMD and PCV.

### Spatial Registration

Spatial registration was performed by a clinician grader (C.H.V.) using custom software developed to register the ICGA image to the NIR image acquired during SD-OCT imaging.
[Kim, Loo, Farsiu, and Jaffe, "Comparison of single drusen size on color fundus photography and spectral-domain optical coherence tomography"; Mukherjee et al., "Correlation between macular integrity assessment and optical coherence tomography imaging of ellipsoid zone in macular telangiectasia type 2"]
Since the NIR and SD-OCT images were acquired simultaneously, this method effectively registered the ICGA to SD-OCT as well. Pairs of corresponding points were identified and selected at various locations on the ICGA and NIR images, usually at vessel intersections or bifurcations and over as wide an area as possible. The software estimated the geometric transformation by mapping the pairs of corresponding points between the images. The geometric transformation was parameterized by 8 unknowns and therefore a minimum of 4 corresponding points was required to estimate the geometric transformation. However, in most cases, > 4 corresponding points were identified and selected to improve the estimate. The accuracy of the estimated geometric transformation was determined by the vessel overlap between the images. Figure 1 shows the spatial registration process.
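The point-based estimation described above can be illustrated with a short sketch. It assumes the 8-parameter transformation is a projective transformation (homography) with its last entry fixed to 1, which is consistent with the stated minimum of 4 point pairs but is not confirmed by the text. The function names `solve_linear`, `estimate_projective`, and `apply_projective` are hypothetical and are not part of the custom registration software.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a square linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def estimate_projective(src: Sequence[Point], dst: Sequence[Point]) -> List[float]:
    """Least-squares estimate of the 8 parameters h11..h32 (h33 is fixed to 1)."""
    if len(src) != len(dst) or len(src) < 4:
        raise ValueError("need at least 4 pairs of corresponding points")
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each point pair contributes two linear equations in the 8 unknowns.
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    # Normal equations (A^T A) h = A^T b; extra pairs are averaged in a least-squares sense.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(8)] for i in range(8)]
    atb = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(8)]
    return solve_linear(ata, atb)


def apply_projective(h: Sequence[float], p: Point) -> Point:
    """Map a point through the estimated transformation."""
    x, y = p
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w)
```

With more than 4 pairs the system is overdetermined, which is why additional points improve the estimate. For raw pixel coordinates, normalizing the points before solving would likely be advisable for numerical stability.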

### Manual Segmentation

Manual segmentation was performed by a clinician grader (C.H.V.) using a custom version of Duke OCT Retinal Analysis Program (DOCTRAP, Version 65.4.8).
[Chiu et al., "Automatic segmentation of seven retinal layers in SDOCT images congruent with expert manual segmentation"; Chiu et al., "Validated automatic segmentation of AMD pathology including drusen and geographic atrophy in SD-OCT images"]
The ICGA and SD-OCT images were displayed side-by-side to enable the clinician to easily refer to both imaging modalities during the segmentation process. The biomarkers segmented on ICGA were polypoidal lesions and BVN. The biomarkers segmented on SD-OCT were the retinal and choroidal layer boundaries, IRF, SRF, and sub-RPE ring-like lesions. Five retinal and choroidal layer boundaries were segmented: the inner limiting membrane (ILM), ellipsoid zone (EZ), RPE, Bruch’s membrane (BM), and choroidal-scleral interface (CSI). To better highlight SIRE and PED, the RPE was segmented only in areas where it was separated from the BM. Otherwise, the RPE shared the same segmentations as the BM to indicate normal RPE without separation. The location of the fovea was also annotated on SD-OCT. Figure 2 shows an example of the manual segmentations in DOCTRAP and conversion to segmentation labels.

### Automatic Segmentation

#### Cross-Validation

Ten-fold cross-validation was used to train and test the automatic segmentation algorithm on all available data to avoid selection bias and ensure independence between the training and testing sets. Participants were randomly divided into 10 groups of approximately equal size. Nine groups were designated as the training set, while the remaining group was designated as the testing set. The groups were then rotated such that each group was used once for testing. For validation, 1 group from the training set was designated as the validation set. As a result, there were 10 models, each trained using a different set of groups. During testing, for a given participant, the model that did not include the participant in its training set was used to generate the automatic segmentations. The same dataset splits were used for both self-supervised pretraining and training, which are described later.
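A minimal sketch of the participant-level split described above is given below. The text does not say which training group served as the validation set in each rotation, so the sketch assumes the group following the test group; the function name `make_folds` and the fixed random seed are likewise illustrative.

```python
import random
from typing import Dict, List


def make_folds(participant_ids: List[str], n_folds: int = 10,
               seed: int = 0) -> List[Dict[str, List[str]]]:
    """Split participants (not images) into rotating train/val/test groups."""
    ids = sorted(participant_ids)
    random.Random(seed).shuffle(ids)
    # Groups of approximately equal size.
    groups = [ids[i::n_folds] for i in range(n_folds)]
    folds = []
    for k in range(n_folds):
        val_k = (k + 1) % n_folds  # assumption: the next group is the validation set
        train = [p for g, grp in enumerate(groups)
                 if g not in (k, val_k) for p in grp]
        folds.append({"train": train, "val": groups[val_k], "test": groups[k]})
    return folds
```

Splitting by participant rather than by image keeps all images of one person in the same group, which is what keeps the training and testing sets independent.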

#### Hybrid Network Architecture With Fusion Attention Modules

The PCV-Net consisted of a hybrid network architecture composed of a 2-D ICGA segmentation branch and a 3-D SD-OCT segmentation branch connected via fusion attention modules. Figure 3 shows the details of the hybrid network architecture.
Briefly, the 2-D ICGA segmentation branch was designed based on a modified version of 2-D U-Net
[Ronneberger, Fischer, and Brox, "U-net: convolutional networks for biomedical image segmentation"]
and the 3-D SD-OCT segmentation branch was designed based on a modified version of 3-D U-Net.
[Çiçek et al., "3D U-Net: learning dense volumetric segmentation from sparse annotation"]
Each segmentation branch consisted of 6 encoder blocks and 5 decoder blocks. The encoder blocks consisted of a max-pooling layer, a convolution layer, and a batch normalization layer followed by rectified linear unit activation, except for the first encoder block, which did not include a max-pooling layer. The decoder blocks consisted of a transposed convolution layer, followed by concatenation with the features from the corresponding encoder block of the same branch, a convolution layer, and a batch normalization layer followed by rectified linear unit activation. Finally, the output block consisted of a convolution layer followed by a softmax activation. The ICGA segmentation branch used 2-D operations, whereas the SD-OCT segmentation branch used 3-D operations.
Fusion attention modules were designed to connect the 2 segmentation branches into a single end-to-end hybrid network. The fusion attention modules were added to the end of each encoder block to enable the transformation and sharing of learned features between branches for effective use of the spatial correspondence between the imaging modalities. The fusion attention modules consisted of an attention mechanism
[Oktay et al., "Attention u-net: learning where to look for the pancreas"]
to selectively gate the important features to be shared with the other branch. To match the dimensionality of the features, dimensionality reduction (from 3-D to 2-D) or expansion (from 2-D to 3-D) was performed accordingly, followed by nearest neighbor interpolation. Then, the transformed features were concatenated with the features from the corresponding encoder block of the other branch. Dimensionality reduction was performed by taking the mean of values across the dimension, whereas dimensionality expansion was performed by tiling the values across the dimension. The ICGA fusion attention modules used 2-D operations, whereas the SD-OCT fusion attention modules used 3-D operations.
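The dimensionality matching steps can be sketched on plain nested lists. The sketch assumes the reduced or tiled dimension is the depth (axial) dimension of the SD-OCT features, which the text does not state explicitly, and it leaves out the attention gating and the concatenation. The names `reduce_depth`, `expand_depth`, and `resize_nearest` are hypothetical.

```python
from typing import List

Grid2D = List[List[float]]        # indexed [row][col]
Grid3D = List[List[List[float]]]  # indexed [row][col][depth]


def reduce_depth(volume: Grid3D) -> Grid2D:
    """3-D to 2-D: take the mean of the values across the depth dimension."""
    return [[sum(col) / len(col) for col in row] for row in volume]


def expand_depth(image: Grid2D, depth: int) -> Grid3D:
    """2-D to 3-D: tile each value across the depth dimension."""
    return [[[value] * depth for value in row] for row in image]


def resize_nearest(image: Grid2D, out_rows: int, out_cols: int) -> Grid2D:
    """Nearest neighbor interpolation to match the other branch's en face size."""
    in_rows, in_cols = len(image), len(image[0])
    return [
        [image[min(in_rows - 1, r * in_rows // out_rows)]
              [min(in_cols - 1, c * in_cols // out_cols)]
         for c in range(out_cols)]
        for r in range(out_rows)
    ]
```

Mean reduction and tiling have no learnable parameters, so in this reading the attention mechanism alone decides which features are passed to the other branch.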

#### Self-supervised Pretraining

Due to the relatively large size of the hybrid network architecture and relatively small dataset, we first pretrained the network using a self-supervised approach based on image reconstruction.
[Jing and Tian, "Self-supervised visual feature learning with deep neural networks: a survey"; Pathak et al., "Context encoders: feature learning by inpainting"]
Self-supervised pretraining enabled the network to learn about the structural content of the images and the pretrained weights provided a better initialization of the network during training of the segmentation task. While other supervised pretraining approaches were available, the advantage of self-supervised pretraining was that no external data or additional manual annotations were required.
For self-supervised pretraining, the output blocks of the hybrid network architecture were replaced with reconstruction blocks. Each reconstruction block consisted of a single convolution layer with 1 output channel and no activation. Briefly, image patches of different sizes were randomly removed from the images by setting the pixel intensities within the patches to 0. The network was trained to reconstruct the information in the missing patches based on the surrounding contextual information. For the ICGA images, 10 patches with a maximum size of 100 × 100 pixels were randomly removed, whereas for the SD-OCT images, 50 patches with a maximum size of 100 × 100 × 10 pixels were randomly removed. Two forms of image augmentation were also randomly applied from the following intensity augmentations: adding a random scalar, multiplying by a random scalar, adding Gaussian noise, contrast normalization, applying Gaussian blur, or no augmentation. The weights of the network were randomly initialized using He initialization
[He, Zhang, Ren, and Sun, "Delving deep into rectifiers: surpassing human-level performance on imagenet classification"]
and optimized using Nesterov’s accelerated stochastic gradient descent
[Sutskever, Martens, Dahl, and Hinton, "On the importance of initialization and momentum in deep learning"]
to minimize a weighted L2 loss. The network was trained for 1000 epochs with a batch size of 1, learning rate of 10⁻⁵, momentum of 0.9, and L2 weight regularization applied with a factor of 1.0.
Class weights were applied to the loss. For the ICGA reconstruction branch, class weights of 5.0, 10.0, and 1.0 were applied to the polypoidal lesions, BVN, and background, respectively. For the SD-OCT reconstruction branch, class weights of 5.0, 10.0, 10.0, 5.0, 15.0, 10.0, 15.0, and 1.0 were applied to the ILM–EZ, EZ–RPE, RPE–BM, BM–CSI, IRF, SRF, sub-RPE ring-like lesions, and background, respectively. Additionally, a weight of 5.0 was added to pixels belonging to the missing patches. The number of epochs, learning rates, regularization factors, and class weights were empirically determined during initial experiments on the validation set.
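The patch removal and the weighted reconstruction loss can be sketched for a single 2-D image as follows. The sketch assumes the loss is averaged over all pixels and that the patch weight is added on top of the class weight of each pixel; the normalization is not given in the text. The names `mask_patches` and `weighted_l2` are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Image = List[List[float]]
Labels = List[List[int]]


def mask_patches(image: Image, n_patches: int, max_size: int,
                 rng: random.Random) -> Tuple[Image, Labels]:
    """Zero out randomly sized and placed patches; return the image and a 0/1 patch mask."""
    rows, cols = len(image), len(image[0])
    masked = [row[:] for row in image]
    mask = [[0] * cols for _ in range(rows)]
    for _ in range(n_patches):
        h = rng.randint(1, min(max_size, rows))
        w = rng.randint(1, min(max_size, cols))
        r0 = rng.randint(0, rows - h)
        c0 = rng.randint(0, cols - w)
        for r in range(r0, r0 + h):
            for c in range(c0, c0 + w):
                masked[r][c] = 0.0
                mask[r][c] = 1
    return masked, mask


def weighted_l2(pred: Image, target: Image, labels: Labels, mask: Labels,
                class_weights: Dict[int, float], patch_weight: float = 5.0) -> float:
    """Mean squared error with per-class weights plus an added weight on removed pixels."""
    total, count = 0.0, 0
    for r in range(len(target)):
        for c in range(len(target[0])):
            w = class_weights[labels[r][c]] + patch_weight * mask[r][c]
            total += w * (pred[r][c] - target[r][c]) ** 2
            count += 1
    return total / count
```

Because the class weights come from the manual segmentation labels of the training set, the reconstruction is pushed to be most accurate in the regions that matter for the later segmentation task.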

#### Training

For training, the network was trained to segment the biomarkers in the images based on the manual segmentations of the training set. Image augmentation was performed as described in self-supervised pretraining. The weights of the network were initialized with the pretrained weights from self-supervised pretraining except for the output blocks, which were randomly initialized using He initialization.
[He, Zhang, Ren, and Sun, "Delving deep into rectifiers: surpassing human-level performance on imagenet classification"]
The weights of the network were optimized using Nesterov’s accelerated stochastic gradient descent
[Sutskever, Martens, Dahl, and Hinton, "On the importance of initialization and momentum in deep learning"]
to minimize a combination of a pixel-wise cross entropy loss and Dice loss.
[Milletari and Navab, "V-net: fully convolutional neural networks for volumetric medical image segmentation"]
The network was trained for 500 epochs with a batch size of 1, learning rate of 0.01, momentum of 0.9, and L2 weight regularization applied with a factor of 0.0001.
Class weights were applied to the loss as described in self-supervised pretraining. The number of epochs, learning rates, regularization factors, and class weights were empirically determined during initial experiments on the validation set.
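As an illustration of the combined objective, the sketch below sums a pixel-wise cross entropy loss and a soft Dice loss over a flat list of pixels. The unweighted sum, the Dice variant with linear (not squared) terms in the denominator, and the omission of class weights are simplifying assumptions; the exact combination used for PCV-Net is not specified in the text.

```python
import math
from typing import List


def cross_entropy(probs: List[List[float]], labels: List[int],
                  eps: float = 1e-7) -> float:
    """Mean negative log-probability assigned to the true class of each pixel."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)


def soft_dice_loss(probs: List[List[float]], labels: List[int],
                   n_classes: int, eps: float = 1e-7) -> float:
    """One minus the soft Dice coefficient, averaged over classes."""
    loss = 0.0
    for k in range(n_classes):
        intersection = sum(p[k] for p, y in zip(probs, labels) if y == k)
        denominator = sum(p[k] for p in probs) + sum(1 for y in labels if y == k)
        loss += 1.0 - (2.0 * intersection + eps) / (denominator + eps)
    return loss / n_classes


def combined_loss(probs: List[List[float]], labels: List[int],
                  n_classes: int) -> float:
    """Assumed combination: plain sum of the two terms."""
    return cross_entropy(probs, labels) + soft_dice_loss(probs, labels, n_classes)
```

The cross entropy term scores every pixel independently, whereas the Dice term scores region overlap per class, which tends to help with small classes such as IRF that occupy few pixels.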

#### Testing

For testing, the trained network was used to segment the biomarkers in the images of the testing set. No image augmentation was performed. The segmentations were enhanced in postprocessing using the following morphological operations.
[Soille, "Morphological Image Analysis: Principles and Applications"]
For each class, a morphological closing operation with a circular or elliptical structuring element and binary filling of holes were applied. For ICGA, circular structuring elements with radius of 3 and 5 pixels were used for polypoidal lesions and BVN, respectively. For SD-OCT, elliptical structuring elements with sizes of 5 × 3, 3 × 1, 3 × 1, 5 × 3, 5 × 1, 5 × 1, and 1 × 5 pixels were used for the ILM–EZ, EZ–RPE, RPE–BM, BM–CSI, IRF, SRF, and sub-RPE ring-like lesions, respectively. For nonlayer biomarkers, regions smaller than 10 pixels were also removed. Additionally, to ensure that the segmentations adhered to the expected pathology and anatomy, several constraints were applied during postprocessing. Any instances of sub-RPE ring-like lesions not within the RPE–BM layer were deleted as sub-RPE ring-like lesions should only exist within the RPE–BM layer. Any instances of IRF or SRF within the RPE–CSI layers were deleted as IRF and SRF are less likely to exist within the RPE–CSI layers.
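Two of the postprocessing steps, removal of small regions and the anatomical constraint on sub-RPE ring-like lesions, can be sketched on binary masks stored as nested lists. The sketch assumes 4-connectivity when grouping pixels into regions, which the text does not specify, and it does not implement the morphological closing or hole filling. The names `remove_small_regions` and `restrict_to_layer` are hypothetical.

```python
from collections import deque
from typing import List

Mask = List[List[int]]


def remove_small_regions(mask: Mask, min_pixels: int = 10) -> Mask:
    """Delete connected regions (4-connectivity) with fewer than min_pixels pixels."""
    rows, cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected region.
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) < min_pixels:
                for y, x in region:
                    out[y][x] = 0
    return out


def restrict_to_layer(lesion: Mask, layer: Mask) -> Mask:
    """Keep lesion pixels only where they fall inside the given layer mask."""
    return [[int(bool(l) and bool(a)) for l, a in zip(lesion_row, layer_row)]
            for lesion_row, layer_row in zip(lesion, layer)]
```

For the closing and hole-filling steps, an image processing library would normally be used; SciPy's `ndimage` module, for example, provides `binary_closing` and `binary_fill_holes`.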

#### Ensembling

To improve segmentation accuracy, an ensembling approach was used during testing. Briefly, 3 rounds of self-supervised pretraining and training were performed to produce 3 trained networks. The segmentations from the 3 trained networks were ensembled based on majority voting and postprocessed as described above.
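Pixel-wise majority voting over the label maps of several networks can be sketched as follows. With 3 networks and more than 2 classes, a three-way tie is possible; the sketch breaks ties toward the lowest label index, which is an assumption, as is the function name `majority_vote`.

```python
from collections import Counter
from typing import List

LabelMap = List[List[int]]


def majority_vote(label_maps: List[LabelMap]) -> LabelMap:
    """Fuse label maps of equal shape by taking the most frequent label per pixel."""
    n_rows, n_cols = len(label_maps[0]), len(label_maps[0][0])
    fused = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            votes = Counter(m[r][c] for m in label_maps)
            best = max(votes.values())
            # Assumption: ties are broken toward the lowest label index.
            row.append(min(label for label, n in votes.items() if n == best))
        fused.append(row)
    return fused
```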

### Comparison to Alternative Model Variants

The PCV-Net was compared to several alternative model variants to determine the effect of each aspect of the proposed method. First, as the baseline variant, we trained the ICGA and SD-OCT segmentation branches separately without the fusion attention modules. Therefore, this baseline variant was analogous to using a 2-D U-Net and 3-D U-Net to operate separately on the ICGA and SD-OCT images without using the spatial correspondence information between the imaging modalities, as in existing algorithms. Second, we trained the hybrid network with fusion attention modules using random initialization of all the weights of the network, instead of using the pretrained weights from self-supervised pretraining. Third, we trained the hybrid network with fusion attention modules and self-supervised pretraining as described above. For all 3 variants, we compared single and ensemble networks. Table 1 shows the technical specifications of the model variants. The model number indicates the variant described above and the letter S or E indicates the single or ensemble network, respectively.
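Table 1. Technical Specifications of the Proposed Method and Alternative Model Variants

| Model | 1S | 1E | 2S | 2E | 3S | 3E |
|---|---|---|---|---|---|---|
| Version | Baseline | Baseline | Intermediate | Intermediate | Proposed PCV-Net | Proposed PCV-Net |
| Baseline | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Fusion attention modules | | | ✓ | ✓ | ✓ | ✓ |
| Self-supervised pretraining | | | | | ✓ | ✓ |
| Ensembling | | ✓ | | ✓ | | ✓ |

E = ensemble; PCV = polypoidal choroidal vasculopathy; S = single.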

### Clinical Measurements

Several clinical measurements of interest were obtained from the segmentations as defined in Table 2. All measurements were calculated within a 6 mm diameter concentric circle centered on the foveal center point as defined by the ETDRS.
[ETDRS Research Group, "Grading diabetic retinopathy from stereoscopic color fundus photographs - an extension of the modified Airlie House classification: ETDRS report number 10"]
Table 2. Definitions of Clinical Measurements of Interest Obtained From the Segmentations

| Imaging Modality | Biomarker | Definition | Measurements of Interest |
|---|---|---|---|
| ICGA | Polypoidal lesions | As segmented | Area |
| ICGA | BVN | As segmented | Area |
| SD-OCT | Retina | ILM–BM layers | Volume, average height |
| SD-OCT | Choroid | BM–CSI layers | Volume, average height |
| SD-OCT | SIRE | RPE–BM layer thickness > 15 μm or RPE layer roughness > 2 μm (within window of 400 μm length) | Volume, average height |
| SD-OCT | PED | RPE–BM layer thickness > 150 μm | Volume, maximum height |
| SD-OCT | EZ defects | Absence of EZ | Area |
| SD-OCT | IRF | As segmented | Volume, average height |
| SD-OCT | SRF | As segmented | Volume, average height |
| SD-OCT | Sub-RPE ring-like lesions | As segmented | Volume, average height |

BM = Bruch’s membrane; BVN = branching vascular network; CSI = choroidal-scleral interface; EZ = ellipsoid zone; ICGA = indocyanine green angiography; ILM = inner limiting membrane; IRF = intraretinal fluid; PCV = polypoidal choroidal vasculopathy; PED = pigment epithelium detachment; RPE = retinal pigment epithelium; SD = spectral domain; SIRE = shallow irregular RPE elevation; SRF = subretinal fluid.
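A volume and average height measurement within the 6 mm diameter circle can be sketched from an en face thickness map. The sketch assumes the map holds one thickness value in micrometers per A-scan position, that the lateral spacings are known in millimeters, and that the average height is taken over every position inside the circle. None of these details are given in the text, and the name `measure_layer` is hypothetical.

```python
from typing import List, Tuple


def measure_layer(thickness_um: List[List[float]], dx_mm: float, dy_mm: float,
                  fovea_xy_mm: Tuple[float, float],
                  radius_mm: float = 3.0) -> Tuple[float, float]:
    """Return (volume in mm^3, average height in micrometers) inside the circle."""
    fx, fy = fovea_xy_mm
    inside = []
    for i, row in enumerate(thickness_um):      # i indexes B-scans (y direction)
        for j, t in enumerate(row):             # j indexes A-scans (x direction)
            x, y = j * dx_mm, i * dy_mm
            if (x - fx) ** 2 + (y - fy) ** 2 <= radius_mm ** 2:
                inside.append(t)
    if not inside:
        return 0.0, 0.0
    # Each position contributes thickness (converted to mm) times its en face footprint.
    volume_mm3 = sum(t / 1000.0 for t in inside) * dx_mm * dy_mm
    average_um = sum(inside) / len(inside)
    return volume_mm3, average_um
```

An area measurement, as used for the ICGA biomarkers and EZ defects, would follow the same pattern by counting positions inside the circle and multiplying by the footprint of one position.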
Clinicians have qualitatively noted that SIRE, defined as elevated or undulating RPE observed on SD-OCT, has a strong correspondence to the BVN observed on ICGA (Cheung et al., 2021). However, a robust quantitative definition of SIRE has yet to be established. Furthermore, the feature defined as a separation of RPE and BM may also refer to PED (Zayit-Soudry et al., 2007; Tsujikawa et al., 2007). These differences between SIRE and PED are intuitively obvious to clinicians but require strict quantitative definitions for automatic segmentation algorithms. While most clinical studies focus only on the maximum height of the PED (Doguizi and Ozdek, 2014; Treumer et al., 2017; Major et al., 2015), a quantitative definition of the minimum elevation required to be considered a PED is also important for other quantified measurements such as volume.
Therefore, we developed quantitative definitions of SIRE and PED by analyzing the correspondence between the manual segmentations of the BVN on ICGA and the RPE–BM layer on SD-OCT. Two criteria are necessary to define and distinguish SIRE and PED: (1) the RPE–BM layer thickness (a quantitative measure of elevation) and (2) the RPE layer roughness (a quantitative measure of the undulating surface). The RPE–BM layer thickness was calculated as the height difference between the segmentations of the RPE and BM layer boundaries. The RPE layer roughness at a given point was calculated as follows:
$\mathrm{roughness} = \mathrm{sd}(\mathrm{diff}(x))$

where $x$ was the segmentation of the RPE layer boundary within a window of 400 μm length centered at the given point as shown in Figure 4, $\mathrm{sd}(\cdot)$ was the standard deviation operation, and $\mathrm{diff}(\cdot)$ was the first difference operation, that is, $\mathrm{diff}(x)_i = x_{i+1} - x_i$. Therefore, a smooth surface would have a roughness close to zero while a more undulating surface would have a higher roughness.
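
The roughness formula can be sketched in Python as below. This is a minimal illustration under stated assumptions: the A-scan spacing is hypothetical, the window is truncated at the image edges, and the sample standard deviation (`ddof=1`) is used because the text does not say which estimator was applied. The function name `rpe_roughness` is illustrative.

```python
import numpy as np

def rpe_roughness(boundary_um, spacing_um, window_um=400.0):
    """Roughness at each A-scan: sd of the first differences of the RPE
    boundary heights inside a window centered on that A-scan.

    boundary_um: 1-D array of RPE boundary heights along a B-scan (micrometers).
    spacing_um:  lateral distance between A-scans (micrometers, assumed known).
    """
    boundary_um = np.asarray(boundary_um, dtype=float)
    half = max(1, int(round(window_um / spacing_um / 2.0)))
    roughness = np.zeros_like(boundary_um)
    for i in range(boundary_um.size):
        lo = max(0, i - half)                       # truncate window at the edges
        hi = min(boundary_um.size, i + half + 1)
        steps = np.diff(boundary_um[lo:hi])         # diff(x)
        if steps.size > 1:
            roughness[i] = steps.std(ddof=1)        # sd(diff(x)), sample estimator
    return roughness

spacing = 10.0                       # hypothetical A-scan spacing in micrometers
flat = np.zeros(100)                 # perfectly flat boundary
ramp = 0.5 * np.arange(100)          # tilted but smooth boundary
zigzag = np.tile([0.0, 4.0], 50)     # strongly undulating boundary
r_flat = rpe_roughness(flat, spacing)
r_ramp = rpe_roughness(ramp, spacing)
r_zigzag = rpe_roughness(zigzag, spacing)
```

Because the measure uses first differences, a tilted but smooth boundary has constant steps and therefore a roughness of zero, whereas the zigzag boundary exceeds the 2 μm threshold everywhere.
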
Grid search was used to determine the optimum thresholds for thickness, roughness, and window length by calculating the overlap between the BVN on ICGA and SIRE (excluding PED) on SD-OCT based on the Dice similarity coefficient (DSC). More details of the DSC are available in Evaluation Metrics. Thickness and roughness thresholds were determined separately and then combined to obtain the SIRE on SD-OCT. Binary morphological operations (Soille, 2013) to fill holes and retain only the largest region were applied to remove small discontinuous regions of SIRE on SD-OCT and improve the correspondence analysis with the BVN on ICGA, which was typically segmented as a single continuous region. The grid search covered 0 to 30 μm for the minimum thickness of SIRE, 100 to 200 μm for the minimum thickness of PED, 1 to 3 μm for the minimum roughness of SIRE, and 300 to 500 μm for the window length. Overall, the highest average DSC achieved across all 72 participants was 0.61 ± 0.22 (mean ± standard deviation; median 0.68) at the thresholds given in Table 2, which were selected as the final thresholds.
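
The thresholding, morphological cleanup, and grid search can be sketched in Python with SciPy as below. This is a simplified illustration, not the authors' procedure. It assumes en face maps of thickness and roughness that are already registered to the ICGA, applies the cleanup before excluding PED (the order is not stated in the text), and searches only two thresholds jointly on a single eye, whereas the text describes determining thresholds separately, also tuning the PED threshold and window length, and averaging over all participants. All function names are hypothetical.

```python
import numpy as np
from scipy import ndimage

def fill_and_keep_largest(mask):
    """Fill holes, then keep only the largest connected region."""
    filled = ndimage.binary_fill_holes(mask)
    labels, n_regions = ndimage.label(filled)
    if n_regions == 0:
        return filled
    sizes = np.bincount(labels.ravel())[1:]          # skip background label 0
    return labels == (1 + int(np.argmax(sizes)))

def sire_masks(thickness_um, roughness_um, sire_thick=15.0, sire_rough=2.0, ped_thick=150.0):
    """Return (SIRE including PED, SIRE excluding PED, PED) as boolean maps."""
    elevated = (thickness_um > sire_thick) | (roughness_um > sire_rough)
    sire_incl_ped = fill_and_keep_largest(elevated)
    ped = thickness_um > ped_thick
    return sire_incl_ped, sire_incl_ped & ~ped, ped

def dice(a, b):
    """DSC of two boolean masks; two empty masks count as perfect agreement."""
    total = int(a.sum()) + int(b.sum())
    return 1.0 if total == 0 else 2.0 * int((a & b).sum()) / total

def grid_search(thickness_um, roughness_um, bvn_mask, thick_grid, rough_grid):
    """Pick the (thickness, roughness) pair maximizing DSC between SIRE and BVN."""
    best = (-1.0, None, None)
    for t in thick_grid:
        for r in rough_grid:
            _, sire, _ = sire_masks(thickness_um, roughness_um, t, r)
            score = dice(sire, bvn_mask)
            if score > best[0]:
                best = (score, t, r)
    return best

# Toy maps: a 20 um ring with a 200 um PED at its center and one rough outlier pixel.
thickness = np.zeros((7, 7))
thickness[1:6, 1:6] = 20.0
thickness[2:5, 2:5] = 0.0
thickness[3, 3] = 200.0
roughness = np.zeros((7, 7))
roughness[6, 6] = 3.0
incl, excl, ped = sire_masks(thickness, roughness)
best_score, best_thick, best_rough = grid_search(
    thickness, roughness, excl, [5.0, 15.0, 25.0], [1.0, 2.0, 3.0])
```

In the toy example the hole inside the ring is filled, the isolated rough pixel is discarded as a small disconnected region, and the PED pixel is removed from the SIRE (excluding PED) mask.
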

### Evaluation Metrics

Several evaluation metrics were calculated to evaluate the performance of PCV-Net. The DSC (Dice, 1945) was calculated to measure the overlap between the manual and automatic segmentations. The DSC ranges from 0 to 1, whereby a higher value indicates better performance. The Pearson’s correlation, r, and absolute difference between the manual and automatic clinical measurements obtained from the segmentations were also calculated. The value of r ranges from −1 to 1, whereby a value closer to 1 indicates better performance. One popular convention is that r ≥ 0.70 indicates high correlation, 0.50 ≤ r < 0.70 indicates moderate correlation, and r < 0.50 indicates low correlation (Mukaka, 2012). For the absolute difference, on the other hand, a lower value indicates better performance. The DSC and absolute difference were calculated for each eye and averaged across all eyes. All eyes were included in the evaluation metrics, even if the specific biomarker was absent in the particular eye, as it is important that an algorithm’s ability to correctly identify the absence of a biomarker is reflected in the evaluation metrics as well. Therefore, if an absent biomarker was correctly identified as absent by the algorithm, this resulted in a DSC of 1. On the other hand, if an absent biomarker was incorrectly identified as present by the algorithm, this resulted in a DSC of 0. The evaluation metrics each measure a different aspect of performance and are complementary to each other. Therefore, it is important that they are considered together when evaluating the overall performance of an algorithm.
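
The evaluation metrics, including the convention for absent biomarkers, can be sketched in Python as follows. The function names are illustrative, and the summary helper assumes the standard error is the sample standard deviation divided by the square root of the number of eyes.

```python
import numpy as np

def dice_coefficient(manual, auto):
    """DSC between two binary segmentations.

    If the biomarker is absent in both, the DSC is defined as 1.
    If it is absent manually but predicted as present, the formula gives 0.
    """
    manual = np.asarray(manual, dtype=bool)
    auto = np.asarray(auto, dtype=bool)
    total = int(manual.sum()) + int(auto.sum())
    if total == 0:
        return 1.0
    return 2.0 * int(np.logical_and(manual, auto).sum()) / total

def pearson_r(manual_values, auto_values):
    """Pearson's correlation between manual and automatic measurements."""
    return float(np.corrcoef(manual_values, auto_values)[0, 1])

def absolute_difference(manual_values, auto_values):
    """Per-eye absolute difference between manual and automatic measurements."""
    return np.abs(np.asarray(manual_values, float) - np.asarray(auto_values, float))

def summarize(values):
    """Return (mean, standard error, median) across eyes."""
    v = np.asarray(values, dtype=float)
    return float(v.mean()), float(v.std(ddof=1) / np.sqrt(v.size)), float(np.median(v))

empty = np.zeros((4, 4), dtype=bool)
present = np.ones((4, 4), dtype=bool)
dsc_absent_correct = dice_coefficient(empty, empty)       # absent and predicted absent
dsc_absent_wrong = dice_coefficient(empty, present)       # absent but predicted present
dsc_partial = dice_coefficient([1, 1, 0, 0], [0, 1, 1, 0])
mean_dsc, se_dsc, median_dsc = summarize([dsc_absent_correct, dsc_absent_wrong, dsc_partial])
r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Because eyes without a biomarker contribute a DSC of exactly 0 or 1, the mean and median of the DSC can differ markedly for rare biomarkers.
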

### Implementation

The custom software for spatial registration and manual segmentation (DOCTRAP, Version 65.4.8) was developed in MATLAB (Version 9.5.0, R2018b; The MathWorks Inc, 2018).
The PCV-Net was developed in Python using the TensorFlow (Version 1.5.1) library (“TensorFlow: a system for large-scale machine learning,” 2016).

## Results

### Dataset

The dataset consisted of 72 pairs of ICGA and SD-OCT images from 72 eyes of 72 participants at the baseline timepoint. All ICGA images and SD-OCT B-scans were resized using bilinear interpolation to a standard size of 512 × 512 pixels.
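
The resizing step can be sketched in Python with SciPy as below. This is an illustration only: the input size of 496 × 768 pixels is hypothetical, and `scipy.ndimage.zoom` with `order=1` is used here as one way to perform bilinear interpolation, not necessarily the routine the authors used.

```python
import numpy as np
from scipy import ndimage

def resize_bilinear(image, size=(512, 512)):
    """Resize a 2-D image to `size` using linear (order=1) interpolation."""
    factors = (size[0] / image.shape[0], size[1] / image.shape[1])
    return ndimage.zoom(image, factors, order=1)

# Hypothetical raw B-scan of 496 x 768 pixels filled with random intensities.
b_scan = np.random.default_rng(0).random((496, 768))
resized = resize_bilinear(b_scan)
```
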

### Quantitative Analysis

Tables 3, 4, and 5 show the average evaluation metrics of the proposed method and alternative model variants on all 72 participants. Table 3 shows the DSC. Tables 4 and 5 show the Pearson’s correlation and the absolute difference, respectively, between the manual and automatic clinical measurements of interest obtained from the segmentations. For ease of visual comparison, Figures 5 and 6 show radar charts of the DSC and correlation, respectively, whereby a larger area of the radar chart indicates better performance. Note that a radar chart of the absolute difference would not be as useful because the clinical measurements differ by orders of magnitude. Overall, the proposed PCV-Net outperformed the baseline variant on all evaluation metrics with higher DSCs, higher correlations, and lower absolute differences. In Table 3, the proposed PCV-Net had the highest DSCs for 6 of the 9 biomarkers, specifically the BVN, ILM–EZ, RPE–BM, BM–CSI, IRF, and SRF, and was at least the next best-performing variant for the other biomarkers. Similar results can be observed in Tables 4 and 5, whereby the proposed PCV-Net had the best performance for many clinical measurements of interest and was at least the next best-performing variant in most of the other cases. In general, improving trends were observed across the model variants as more technical specifications were added, demonstrating the importance of each aspect of the proposed method. Scatterplots of the manual and automatic clinical measurements of interest are available in Online Supplementary Information 1 and an inter-reader analysis is available in Online Supplementary Information 2.
Table 3. Average (Mean ± Standard Error, Median) DSC of the Proposed Method and Alternative Model Variants on All 72 Participants

| Imaging Modality | Biomarker | 1S (Baseline) | 1E (Baseline) | 2S (Intermediate) | 2E (Intermediate) | 3S (Proposed PCV-Net) | 3E (Proposed PCV-Net) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ICGA | Polypoidal lesions | 0.42 ± 0.03, 0.52 | 0.49 ± 0.03, 0.54 | 0.40 ± 0.03, 0.42 | 0.45 ± 0.03, 0.50 | 0.39 ± 0.03, 0.45 | 0.47 ± 0.03, 0.54 |
| ICGA | BVN | 0.41 ± 0.03, 0.43 | 0.43 ± 0.03, 0.45 | 0.39 ± 0.03, 0.42 | 0.43 ± 0.03, 0.47 | 0.41 ± 0.03, 0.43 | 0.46 ± 0.03, 0.49 |
| SD-OCT | ILM–EZ | 0.90 ± 0.01, 0.91 | 0.92 ± 0.00, 0.93 | 0.93 ± 0.00, 0.94 | 0.94 ± 0.00, 0.95 | 0.93 ± 0.00, 0.94 | 0.94 ± 0.00, 0.95 |
| SD-OCT | EZ–RPE | 0.66 ± 0.01, 0.68 | 0.72 ± 0.01, 0.74 | 0.77 ± 0.01, 0.80 | 0.79 ± 0.01, 0.81 | 0.77 ± 0.01, 0.79 | 0.79 ± 0.01, 0.80 |
| SD-OCT | RPE–BM | 0.46 ± 0.02, 0.46 | 0.56 ± 0.02, 0.56 | 0.65 ± 0.02, 0.65 | 0.69 ± 0.02, 0.68 | 0.66 ± 0.02, 0.66 | 0.69 ± 0.02, 0.68 |
| SD-OCT | BM–CSI | 0.77 ± 0.01, 0.78 | 0.81 ± 0.01, 0.83 | 0.85 ± 0.01, 0.87 | 0.87 ± 0.01, 0.88 | 0.85 ± 0.01, 0.87 | 0.87 ± 0.01, 0.89 |
| SD-OCT | IRF | 0.02 ± 0.00, 0.00 | 0.22 ± 0.05, 0.00 | 0.11 ± 0.04, 0.00 | 0.38 ± 0.06, 0.00 | 0.09 ± 0.03, 0.00 | 0.45 ± 0.06, 0.01 |
| SD-OCT | SRF | 0.43 ± 0.03, 0.48 | 0.52 ± 0.03, 0.58 | 0.56 ± 0.03, 0.61 | 0.61 ± 0.03, 0.69 | 0.55 ± 0.03, 0.64 | 0.61 ± 0.03, 0.71 |
| SD-OCT | Sub-RPE ring-like lesions | 0.02 ± 0.01, 0.00 | 0.01 ± 0.01, 0.00 | 0.12 ± 0.02, 0.01 | 0.11 ± 0.03, 0.00 | 0.14 ± 0.03, 0.03 | 0.11 ± 0.03, 0.00 |

BM = Bruch’s membrane; BVN = branching vascular network; CSI = choroidal-scleral interface; DSC = Dice similarity coefficient; E = ensemble; EZ = ellipsoid zone; ICGA = indocyanine green angiography; ILM = inner limiting membrane; IRF = intraretinal fluid; PCV = polypoidal choroidal vasculopathy; PED = pigment epithelium detachment; RPE = retinal pigment epithelium; S = single; SD = spectral domain; SRF = subretinal fluid.
The highest DSC is shown in bold, and the lowest DSC is shown in italics.
Table 4. Pearson’s Correlation, r, of the Manual and Automatic Clinical Measurements of Interest by the Proposed Method and Alternative Model Variants on All 72 Participants

| Imaging Modality | Clinical Measurement | 1S (Baseline) | 1E (Baseline) | 2S (Intermediate) | 2E (Intermediate) | 3S (Proposed PCV-Net) | 3E (Proposed PCV-Net) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ICGA | Polypoidal lesions area (mm²) | 0.59 | 0.73 | 0.69 | 0.79 | 0.81 | 0.80 |
| ICGA | BVN area (mm²) | 0.17 | 0.14 | 0.35 | 0.44 | 0.47 | 0.47 |
| SD-OCT | Retinal volume (mm³) | 0.86 | 0.90 | 0.97 | 0.98 | 0.97 | 0.98 |
| SD-OCT | Retinal average height (mm) | 0.92 | 0.94 | 0.98 | 0.98 | 0.97 | 0.98 |
| SD-OCT | Choroidal volume (mm³) | 0.49 | 0.70 | 0.83 | 0.87 | 0.80 | 0.89 |
| SD-OCT | Choroidal average height (mm) | 0.51 | 0.71 | 0.83 | 0.87 | 0.79 | 0.88 |
| SD-OCT | SIRE (including PED) volume (mm³) | 0.68 | 0.65 | 0.88 | 0.90 | 0.85 | 0.89 |
| SD-OCT | SIRE (including PED) average height (mm) | 0.72 | 0.78 | 0.89 | 0.91 | 0.89 | 0.90 |
| SD-OCT | SIRE (excluding PED) volume (mm³) | 0.71 | 0.80 | 0.83 | 0.89 | 0.86 | 0.87 |
| SD-OCT | SIRE (excluding PED) average height (mm) | 0.46 | 0.64 | 0.73 | 0.68 | 0.62 | 0.62 |
| SD-OCT | PED volume (mm³) | 0.66 | 0.62 | 0.87 | 0.90 | 0.84 | 0.89 |
| SD-OCT | PED maximum height (mm) | 0.54 | 0.76 | 0.87 | 0.93 | 0.87 | 0.93 |
| SD-OCT | EZ defects area (mm²) | 0.85 | 0.86 | 0.89 | 0.91 | 0.90 | 0.92 |
| SD-OCT | IRF volume (mm³) | 0.17 | 0.54 | 0.49 | 0.65 | 0.39 | 0.63 |
| SD-OCT | IRF average height (mm) | 0.17 | 0.20 | 0.18 | 0.30 | 0.19 | 0.21 |
| SD-OCT | SRF volume (mm³) | 0.84 | 0.90 | 0.93 | 0.94 | 0.93 | 0.94 |
| SD-OCT | SRF average height (mm) | 0.71 | 0.80 | 0.86 | 0.89 | 0.83 | 0.88 |
| SD-OCT | Sub-RPE ring-like lesions volume (mm³) | 0.08 | 0.02 | 0.50 | 0.53 | 0.57 | 0.56 |
| SD-OCT | Sub-RPE ring-like lesions average height (mm) | 0.01 | −0.13 | 0.41 | 0.38 | 0.49 | 0.43 |

BVN = branching vascular network; E = ensemble; EZ = ellipsoid zone; ICGA = indocyanine green angiography; IRF = intraretinal fluid; PCV = polypoidal choroidal vasculopathy; PED = pigment epithelium detachment; RPE = retinal pigment epithelium; S = single; SD = spectral domain; SIRE = shallow irregular RPE elevation; SRF = subretinal fluid.
The highest correlation is shown in bold, and the lowest correlation is shown in italics.
Table 5. Average (Mean ± Standard Error, Median) Absolute Difference of the Manual and Automatic Clinical Measurements of Interest by the Proposed Method and Alternative Model Variants on All 72 Participants

| Imaging Modality | Clinical Measurement | 1S (Baseline) | 1E (Baseline) | 2S (Intermediate) | 2E (Intermediate) | 3S (Proposed PCV-Net) | 3E (Proposed PCV-Net) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ICGA | Polypoidal lesions area (mm²) | 0.08 ± 0.01, 0.06 | 0.08 ± 0.01, 0.05 | 0.07 ± 0.01, 0.05 | 0.08 ± 0.01, 0.06 | 0.07 ± 0.01, 0.05 | 0.08 ± 0.01, 0.06 |
| ICGA | BVN area (mm²) | 3.67 ± 0.38, 2.66 | 3.63 ± 0.37, 2.51 | 3.05 ± 0.34, 2.26 | 3.02 ± 0.31, 2.01 | 2.69 ± 0.31, 1.72 | 2.81 ± 0.31, 2.06 |
| SD-OCT | Retinal volume (mm³) | 0.81 ± 0.15, 0.36 | 0.49 ± 0.13, 0.18 | 0.31 ± 0.06, 0.13 | 0.26 ± 0.06, 0.12 | 0.32 ± 0.06, 0.14 | 0.25 ± 0.06, 0.14 |
| SD-OCT | Retinal average height (mm) | 0.02 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 | 0.02 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 |
| SD-OCT | Choroidal volume (mm³) | 1.67 ± 0.16, 1.29 | 1.33 ± 0.14, 1.17 | 1.12 ± 0.10, 1.00 | 1.03 ± 0.09, 0.93 | 1.28 ± 0.11, 1.16 | 0.97 ± 0.08, 0.79 |
| SD-OCT | Choroidal average height (mm) | 0.06 ± 0.01, 0.05 | 0.05 ± 0.00, 0.04 | 0.04 ± 0.00, 0.04 | 0.04 ± 0.00, 0.04 | 0.05 ± 0.00, 0.04 | 0.04 ± 0.00, 0.04 |
| SD-OCT | SIRE (including PED) volume (mm³) | 0.49 ± 0.12, 0.16 | 0.41 ± 0.13, 0.14 | 0.33 ± 0.08, 0.11 | 0.27 ± 0.07, 0.08 | 0.33 ± 0.09, 0.10 | 0.27 ± 0.07, 0.09 |
| SD-OCT | SIRE (including PED) average height (mm) | 0.04 ± 0.00, 0.02 | 0.03 ± 0.00, 0.01 | 0.03 ± 0.00, 0.01 | 0.02 ± 0.00, 0.01 | 0.02 ± 0.00, 0.02 | 0.02 ± 0.00, 0.01 |
| SD-OCT | SIRE (excluding PED) volume (mm³) | 0.10 ± 0.01, 0.07 | 0.11 ± 0.01, 0.07 | 0.09 ± 0.01, 0.06 | 0.07 ± 0.01, 0.05 | 0.07 ± 0.01, 0.05 | 0.07 ± 0.01, 0.05 |
| SD-OCT | SIRE (excluding PED) average height (mm) | 0.01 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 |
| SD-OCT | PED volume (mm³) | 0.46 ± 0.12, 0.10 | 0.36 ± 0.13, 0.08 | 0.30 ± 0.08, 0.05 | 0.24 ± 0.07, 0.04 | 0.30 ± 0.09, 0.07 | 0.23 ± 0.08, 0.04 |
| SD-OCT | PED maximum height (mm) | 0.17 ± 0.02, 0.08 | 0.10 ± 0.01, 0.07 | 0.07 ± 0.01, 0.03 | 0.05 ± 0.01, 0.03 | 0.07 ± 0.01, 0.03 | 0.05 ± 0.01, 0.03 |
| SD-OCT | EZ defects area (mm²) | 2.36 ± 0.23, 1.98 | 2.56 ± 0.25, 2.21 | 2.06 ± 0.22, 1.49 | 1.96 ± 0.24, 1.27 | 1.85 ± 0.22, 1.28 | 1.82 ± 0.22, 1.06 |
| SD-OCT | IRF volume (mm³) | 0.03 ± 0.01, 0.02 | 0.02 ± 0.01, 0.00 | 0.03 ± 0.00, 0.01 | 0.02 ± 0.01, 0.00 | 0.03 ± 0.01, 0.01 | 0.02 ± 0.01, 0.00 |
| SD-OCT | IRF average height (mm) | 0.03 ± 0.00, 0.03 | 0.03 ± 0.00, 0.02 | 0.03 ± 0.00, 0.03 | 0.03 ± 0.00, 0.01 | 0.03 ± 0.00, 0.02 | 0.03 ± 0.00, 0.01 |
| SD-OCT | SRF volume (mm³) | 0.34 ± 0.08, 0.11 | 0.32 ± 0.08, 0.10 | 0.24 ± 0.07, 0.06 | 0.19 ± 0.06, 0.04 | 0.22 ± 0.06, 0.06 | 0.17 ± 0.05, 0.05 |
| SD-OCT | SRF average height (mm) | 0.02 ± 0.00, 0.01 | 0.02 ± 0.00, 0.01 | 0.02 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 | 0.02 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 |
| SD-OCT | Sub-RPE ring-like lesions volume (mm³) | 0.01 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 | 0.01 ± 0.00, 0.01 |
| SD-OCT | Sub-RPE ring-like lesions average height (mm) | 0.05 ± 0.00, 0.04 | 0.07 ± 0.00, 0.06 | 0.05 ± 0.00, 0.04 | 0.06 ± 0.00, 0.05 | 0.04 ± 0.00, 0.04 | 0.05 ± 0.00, 0.05 |

BVN = branching vascular network; E = ensemble; EZ = ellipsoid zone; ICGA = indocyanine green angiography; IRF = intraretinal fluid; PCV = polypoidal choroidal vasculopathy; PED = pigment epithelium detachment; RPE = retinal pigment epithelium; S = single; SD = spectral domain; SIRE = shallow irregular RPE elevation; SRF = subretinal fluid.
The lowest absolute difference is shown in bold, and the highest absolute difference is shown in italics.

### Qualitative Analysis

Figures 7 and 8 show examples of the manual segmentations by the 2 independent clinicians and automatic segmentations by the proposed method and alternative model variants on ICGA and SD-OCT images of PCV. More examples are available in Online Supplementary Information 3. Overall, the proposed PCV-Net produced the best automatic segmentations among the model variants as it had the highest similarity when compared to the manual segmentations by clinicians. While the alternative variants showed reasonable performance on images with minor pathology, PCV-Net outperformed the alternative variants on difficult images with severe pathology. In several challenging cases, there was poor agreement even between the manual segmentations by clinicians, highlighting the complexity and ambiguity of the pathology.

## Discussion

We have proposed and developed PCV-Net, a deep learning-based hybrid algorithm for joint multimodal automatic segmentation of multiple PCV biomarkers on ICGA and SD-OCT images. The PCV-Net consisted of a hybrid network architecture comprising a 2-D ICGA segmentation branch and a 3-D SD-OCT segmentation branch connected via fusion attention modules, which we developed to effectively use the spatial correspondence between the imaging modalities.
Overall, the proposed method showed good performance compared to alternative model variants based on both quantitative and qualitative analyses when applied to images from 2 PCV clinical studies. Compared to the baseline variant, which was analogous to using a 2-D U-Net and 3-D U-Net to operate separately on the ICGA and SD-OCT images without using the spatial correspondence information between the imaging modalities, the proposed method improved the DSC by 0.04 to 0.43 across the different biomarkers, increased the correlations, and decreased the absolute differences of clinical measurements of interest. While most of the performance improvement could be attributed to the fusion attention modules, we also showed that the additional aspects of self-supervised pretraining and ensembling further boosted the algorithm’s performance. Although the quantitative improvements resulting from self-supervised pretraining and ensembling were smaller, large qualitative improvements were observed in the segmentations. For example, ensembling reduced regions of noisy segmentations and self-supervised pretraining improved the segmentations of some of the rarer biomarkers such as IRF and sub-RPE ring-like lesions.
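
The reported range of DSC improvements can be checked against Table 3 with a short Python calculation. The mean DSC values below are transcribed from Table 3 for model 1S (baseline) and model 3E (proposed PCV-Net). The dictionary names are illustrative.

```python
# Mean DSC per biomarker, transcribed from Table 3.
baseline_1s = {
    "Polypoidal lesions": 0.42, "BVN": 0.41, "ILM-EZ": 0.90, "EZ-RPE": 0.66,
    "RPE-BM": 0.46, "BM-CSI": 0.77, "IRF": 0.02, "SRF": 0.43,
    "Sub-RPE ring-like lesions": 0.02,
}
pcv_net_3e = {
    "Polypoidal lesions": 0.47, "BVN": 0.46, "ILM-EZ": 0.94, "EZ-RPE": 0.79,
    "RPE-BM": 0.69, "BM-CSI": 0.87, "IRF": 0.45, "SRF": 0.61,
    "Sub-RPE ring-like lesions": 0.11,
}

# Improvement of PCV-Net over the baseline, rounded to the table's precision.
improvement = {name: round(pcv_net_3e[name] - baseline_1s[name], 2) for name in baseline_1s}
smallest = min(improvement, key=improvement.get)
largest = max(improvement, key=improvement.get)
```

The smallest improvement is for the ILM–EZ layer (0.04) and the largest is for IRF (0.43), which matches the stated range.
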
The inter-reader analysis showed that the proposed PCV-Net has room for further improvement before it can achieve performance on par with manual segmentations by clinicians. In particular, the DSC differed by ≥ 0.20 between manual and automatic segmentations for polypoidal lesions and BVN on ICGA and sub-RPE ring-like lesions on SD-OCT, indicating the most important biomarkers to focus on for further improvement. The inter-reader analysis also highlighted the difficulty of obtaining an accurate gold standard from manual segmentations in such a complex disease. Sub-RPE ring-like lesions, polypoidal lesions, IRF, BVN, and EZ are notoriously difficult to segment even for experienced clinicians, and the independent clinicians (C.H.V. and J.N.M.J-Y./ C.H.V. and A.B.J.) only agreed 38% to 65% of the time for these biomarkers. Upon further analysis, some of the disagreements between the clinicians could be attributed to manual segmentation errors by 1 of the clinicians or differing opinions in ambiguous regions of the images. However, we note that differing opinions or differing approaches to segmenting these ambiguous regions may not necessarily have a significant impact on clinical outcome, as long as the opinions and approaches remain consistent over time, as has been shown in previous work (Loo et al., 2020).
As we continue to develop PCV-Net in future work, we will explore several steps to further improve the performance of the algorithm. First, we will focus on the collection of a larger dataset with more accurate manual segmentations to create a high-quality gold standard for training and evaluation. Many deep learning-based algorithms are developed with datasets on the order of thousands or even millions of examples. While it is certainly harder to curate such large datasets in the medical domain, we will consider alternative approaches such as combining or averaging the manual segmentations from multiple clinicians to improve the accuracy and robustness of the manual segmentations. We also aim to evaluate the generalizability of PCV-Net to data from other clinical studies and institutions, as we have done with our previous algorithms (Loo et al., 2022), as the generalizability of an algorithm is an important aspect for clinical application. Additionally, we will conduct intrareader analyses to provide more insight into the performance of the individual clinicians in repeated manual segmentations of the same images. Similarly, we will conduct longitudinal intrareader analyses to evaluate the consistency of both manual and automatic segmentations over time to provide a more clinically relevant assessment of performance (Loo et al., 2020). Besides that, performance gains may be achieved by increasing the size of the network or using an ensemble of a larger number of networks, albeit at the cost of increased training and testing times. Furthermore, while only simple postprocessing using morphological operations was used in PCV-Net, advanced postprocessing techniques (Salvi et al., 2021) may also improve the quality of the automatic segmentations of the algorithm. We elected not to use such advanced postprocessing techniques in PCV-Net as our focus was on highlighting the effectiveness of the hybrid network architecture with fusion attention modules.
We also note that the hybrid network architecture with fusion attention modules is not limited to assessing PCV and can easily be adapted for any application that requires the combination of 2-D and 3-D information. While the architectures of the 2-D and 3-D segmentation branches in the hybrid network were designed specifically for ICGA and SD-OCT in our application, the fusion attention modules can easily be used with different branch architectures as well.
In conclusion, we have proposed and developed PCV-Net, a deep learning-based hybrid algorithm for automatic joint multimodal segmentation of multiple biomarkers of PCV on ICGA and SD-OCT images. We have highlighted the importance of the effective use of the spatial correspondence between imaging modalities and developed fusion attention modules which can share learned features between imaging modalities of different dimensionalities without discarding information for overall improved performance. The PCV-Net can provide several clinical measurements of interest for important biomarkers, including SIRE and PED, for which we derived quantitative definitions, at a fraction of the time and cost of manual segmentation by clinicians. Therefore, there is great potential for the algorithm to aid clinicians in disease assessment. As the research progresses and the algorithm is further developed, we expect it to lead to a better understanding of the disease, ultimately improving clinical management and patient outcomes.

## Supplementary Data

• Supplementary information 1
• Supplementary information 2
• Supplementary information 3

## References

• Ciardella A.P., Donsoff I.M., Huang S.J., et al. Polypoidal choroidal vasculopathy. Surv Ophthalmol. 2004; 49: 25-37
• Cheung C.M.G., Lai T.Y., Ruamviboonsuk P., et al. Polypoidal choroidal vasculopathy: definition, pathogenesis, diagnosis, and management. Ophthalmology. 2018; 125: 708-724
• Spaide R.F., Yannuzzi L.A., Slakter J.S., et al. Indocyanine green videoangiography of idiopathic polypoidal choroidal vasculopathy. Retina. 1995; 15: 100-110
• Spaide R.F., Jaffe G.J., Sarraf D., et al. Consensus nomenclature for reporting neovascular age-related macular degeneration data: consensus on neovascular age-related macular degeneration nomenclature study group. Ophthalmology. 2020; 127: 616-636
• Cheung C.M.G., Lai T.Y., Teo K., et al. Polypoidal choroidal vasculopathy: consensus nomenclature and non–indocyanine green angiograph diagnostic criteria from the Asia-Pacific Ocular Imaging Society PCV Workgroup. Ophthalmology. 2021; 128: 443-452
• Teo K.Y.C., Gillies M., Fraser-Bell S. The use of vascular endothelial growth factor inhibitors and complementary treatment options in polypoidal choroidal vasculopathy: a subtype of neovascular age-related macular degeneration. Int J Mol Sci. 2018; 19: 2611
• Teo K.Y.C., Cheung C.M.G., et al. Non-ICGA treatment criteria for suboptimal anti-VEGF response for polypoidal choroidal vasculopathy: APOIS PCV Workgroup Report 2. Ophthalmol Retina. 2021; 5: 945-953
• Takahashi K., Ohji M., et al. Efficacy and safety of intravitreal aflibercept treat-and-extend regimens in the ALTAIR study: 96-week outcomes in the polypoidal choroidal vasculopathy subgroup.
• Teo K.Y.C., Jordan-Yu J.M., Tan A.C., et al. Efficacy of a novel personalised aflibercept monotherapy regimen based on polypoidal lesion closure in participants with polypoidal choroidal vasculopathy. Br J Ophthalmol. 2022; 106: 987-993
• Maruko I., Ogasawara M., Yamamoto A., et al. Two-year outcomes of treat-and-extend intravitreal aflibercept for exudative age-related macular degeneration: a prospective study. Ophthalmol Retina. 2020; 4: 767-776
• Gomi F., Oshima Y., Mori R., et al. Initial versus delayed photodynamic therapy in combination with ranibizumab for treatment of polypoidal choroidal vasculopathy: the Fujisan study. Retina. 2015; 35: 1569-1576
• Koh A., Lai T.Y., Takahashi K., et al. Efficacy and safety of ranibizumab with or without verteporfin photodynamic therapy for polypoidal choroidal vasculopathy: a randomized clinical trial. JAMA Ophthalmol. 2017; 135: 1206-1213
• Spaide R.F., Donsoff I., Lam D.L., et al. Treatment of polypoidal choroidal vasculopathy with photodynamic therapy. Retina. 2012; 32: 529-535
• Koh A., Lee W.K., Chen L.-J., et al. EVEREST study: efficacy and safety of verteporfin photodynamic therapy in combination with ranibizumab or alone versus ranibizumab monotherapy in patients with symptomatic macular polypoidal choroidal vasculopathy. Retina. 2012; 32: 1453-1464
• Lee W.K., Iida T., Ogura Y., et al. Efficacy and safety of intravitreal aflibercept for polypoidal choroidal vasculopathy in the PLANET study: a randomized clinical trial. JAMA Ophthalmol. 2018; 136: 786-793
• Oishi A., Kojima H., Mandai M., et al. Comparison of the effect of ranibizumab and verteporfin for polypoidal choroidal vasculopathy: 12-month LAPTOP study results. Am J Ophthalmol. 2013; 156: 644-651
• Loo J., Woodward M.A., Prajna V., et al. Open-source automatic biomarker measurement on slit-lamp photography to estimate visual acuity in microbial keratitis. Transl Vis Sci Technol. 2021; 10: 2
• Kim D.Y., Loo J., Farsiu S., Jaffe G.J. Comparison of single drusen size on color fundus photography and spectral-domain optical coherence tomography. Retina. 2021; 41: 1715-1722
• Ferris III, F.L., Wilkinson C., Bird A., et al. Clinical classification of age-related macular degeneration. Ophthalmology. 2013; 120: 844-851
• Schmidt-Erfurth U., Gerendas B.S., et al. Artificial intelligence in retina. Prog Retin Eye Res. 2018; 67: 1-29
• Wang Z., Keane P.A., Chiang M., et al. Artificial intelligence and deep learning in ophthalmology. In: Artificial Intelligence in Medicine. Springer International Publishing, Basel, Switzerland; 2020: 1-34
• Rasti R., Allingham M.J., Mettu P.S., et al. Deep learning-based single-shot prediction of differential effects of anti-VEGF treatment in patients with diabetic macular edema. Biomed Opt Express. 2020; 11: 1139-1152
• Lin W.-Y., Yang S.-C., Chen S.-J., et al. Automatic segmentation of polypoidal choroidal vasculopathy from indocyanine green angiography using spatial and temporal patterns. Transl Vis Sci Technol. 2015; 4: 7
• Xu Y., Yan K., Kim J., et al. Dual-stage deep learning framework for pigment epithelium detachment segmentation in polypoidal choroidal vasculopathy. Biomed Opt Express. 2017; 8: 4061-4076
• Xu Z., Wang W., Yang J., et al. Automated diagnoses of age-related macular degeneration and polypoidal choroidal vasculopathy using bi-modal deep convolutional neural networks. Br J Ophthalmol. 2021; 105: 561-566
• Chou Y.-B., Hsu C.-H., Chen W.-S., et al. Deep learning and ensemble stacking technique for differentiating polypoidal choroidal vasculopathy from neovascular age-related macular degeneration. Sci Rep. 2021; 11: 1-9
• Cheung C.M.G., Bhargava M., Laude A., et al. Asian age-related macular degeneration phenotyping study: rationale, design and protocol of a prospective cohort study. Clin Exp Ophthalmol. 2012; 40: 727-735
• Tan A., Jordan-Yu J.M., Vyas C.H., et al. Optical coherence tomography features of polypoidal lesion closure in polypoidal choroidal vasculopathy treated with aflibercept. Retina. 2022; 42: 114-122
• Vyas C.H., Cheung C.M.G., Jordan-Yu J.M.N., et al. Novel volumetric imaging biomarkers for assessing disease activity in eyes with PCV. Sci Rep. 2022; 12: 1-10
• Mukherjee D., Vann R.R., et al. Correlation between macular integrity assessment and optical coherence tomography imaging of ellipsoid zone in macular telangiectasia type 2. Invest Ophthalmol Vis Sci. 2017; 58: 291-299
• Chiu S.J., Li X.T., Nicholas P., et al. Automatic segmentation of seven retinal layers in SDOCT images congruent with expert manual segmentation. Opt Express. 2010; 18: 19413-19428
• Chiu S.J., Izatt J.A., O'Connell R.V., et al. Validated automatic segmentation of AMD pathology including drusen and geographic atrophy in SD-OCT images. Invest Ophthalmol Vis Sci. 2012; 53: 53-61
• Ronneberger O., Fischer P., Brox T. U-net: convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer International Publishing, Basel, Switzerland; 2015: 234-241
• Çiçek Ö., Lienkamp S.S., et al. 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer International Publishing, Basel, Switzerland; 2016: 424-432
• Oktay O., Schlemper J., Folgoc L.L., et al. Attention u-net: learning where to look for the pancreas. arXiv. 2018; https://doi.org/10.48550/arXiv.1804.03999
• Jing L., Tian Y. Self-supervised visual feature learning with deep neural networks: a survey. IEEE Trans Pattern Anal Mach Intell. 2020; 43: 4037-4058
• Pathak D., Krahenbuhl P., Donahue J., et al. Context encoders: feature learning by inpainting. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, New York, NY; 2016: 2536-2544
• He K., Zhang X., Ren S., Sun J. Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In: Proceedings of the IEEE International Conference on Computer Vision. IEEE, New York, NY; 2015: 1026-1034
• Sutskever I., Martens J., Dahl G., Hinton G. On the importance of initialization and momentum in deep learning. In: International Conference on Machine Learning. Journal of Machine Learning Research, Cambridge, MA; 2013: 1139-1147
• Milletari F., Navab N. V-net: fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision. IEEE, New York, NY; 2016: 565-571
• Soille P. Morphological Image Analysis: Principles and Applications. Springer Science & Business Media, 2013
• ETDRS Research Group. Grading diabetic retinopathy from stereoscopic color fundus photographs—an extension of the modified Airlie house classification: ETDRS report number 10. Ophthalmology. 1991; 98: 786-806
• Zayit-Soudry S., Moroz I., Loewenstein A. Retinal pigment epithelial detachment. Surv Ophthalmol. 2007; 52: 227-243
• Tsujikawa A., Sasahara M., Otani A., et al. Pigment epithelial detachment in polypoidal choroidal vasculopathy. Am J Ophthalmol. 2007; 143: 102-111.e101
• Doguizi S., Ozdek S. Pigment epithelial tears associated with anti-VEGF therapy: incidence, long-term visual outcome, and relationship with pigment epithelial detachment in age-related macular degeneration. Retina. 2014; 34: 1156-1162
• Treumer F., Wienand S., Purtskhvanidze K., et al. The role of pigment epithelial detachment in AMD with submacular hemorrhage treated with vitrectomy and subretinal co-application of rtPA and anti-VEGF. Graefes Arch Clin Exp Ophthalmol. 2017; 255: 1115-1123
• Major Jr., J.C., Wykoff C.C., Croft D.E., et al. Aflibercept for pigment epithelial detachment for previously treated neovascular age-related macular degeneration. Can J Ophthalmol. 2015; 50: 373-377
• Dice L.R. Measures of the amount of ecologic association between species. Ecology. 1945; 26: 297-302
• Mukaka M.M. A guide to appropriate use of correlation coefficient in medical research. Malawi Med J. 2012; 24: 69-71
• MATLAB. Version 9.5.0 (R2018b). The MathWorks Inc, 2018
• Barham P., Chen J., et al. TensorFlow: a system for large-scale machine learning. In: Operating Systems Design and Implementation (OSDI). USENIX Association, Berkeley, CA; 2016: 265-283
• Loo J., Clemons T.E., Chew E.Y., et al. Beyond performance metrics: automatic deep learning retinal OCT analysis reproduces clinical trial outcome. Ophthalmology. 2020; 127: 793-801
• Loo J., Jaffe G.J., Duncan J.L., et al. Validation of a deep learning-based algorithm for segmentation of the ellipsoid zone on optical coherence tomography images of an USH2A-related retinal degeneration clinical trial. Retina. 2022; 42: 1347-1355
• Salvi M., Acharya U.R., Molinari F., Meiburger K.M. The impact of pre-and post-image processing techniques on deep learning frameworks: a comprehensive review for digital pathology image analysis. Comput Biol Med. 2021; 128: 104129