1 Introduction

Choroidal neovascularization (CNV), which causes deterioration of vision, is characterized by the growth of abnormal blood vessels in the choroidal layer [2]. One of the medical treatments for CNV is anti-vascular endothelial growth factor (anti-VEGF) injection, which may inhibit the growth of CNV. However, the treatment is very expensive and may be ineffective for some patients. Hence, accurate detection and quantification of CNV would be extremely useful to clinicians in diagnosis and in evaluating the therapeutic effect of the treatment [5].

As a non-invasive, non-contact imaging modality, Optical Coherence Tomography (OCT) [3] has become a powerful tool to monitor the change of CNV. As the number of acquired OCT volumes increases, estimating the area of CNV automatically is becoming increasingly relevant. Traditionally, segmentation-based methods are popular for this task [1, 5, 12]. However, segmentation-based methods are highly dependent on intermediate steps, which inevitably introduce cumulative errors.

Recently, some researchers have proposed direct methods that avoid segmentation for automated medical image analysis, and their effectiveness has been demonstrated in many works [6, 7, 14]. In addition, with the revival of convolutional neural networks (CNNs) [4], direct methods have become more convenient, since CNNs can be trained directly on raw data and generally achieve significant improvements over traditional approaches.

In this paper, we train a CNN on raw images to estimate the area of CNV in retinal OCT images directly. The effectiveness of this simple approach is verified by the experimental results. However, CNNs are notoriously opaque: it is difficult to derive the function a CNN approximates by studying its structure. To deal with this problem, we try to find a surrogate model with an explicit mathematical expression that fits the function of each layer in the CNN. To this end, genetic programming (GP) [9], which can automatically evolve both the structure and the parameters of a mathematical model from data, is employed to derive the model. The contributions of this paper are summarized as follows.

  • We demonstrate that using CNNs to estimate the area of CNV in retinal OCT images without segmentation is very promising.

  • We propose using GP to derive mathematical models that fit the function being approximated by CNNs, which is a potential way to explain how CNNs work.

The rest of this paper is organized as follows: In Sect. 2, the details of CNNs and GP are described. Section 3 presents the experimental results, including the performance of the CNN on direct area estimation and the functions obtained by GP. The discussion and conclusion are given in Sect. 4.

2 Methods

2.1 Convolutional Neural Networks

Convolutional neural networks can be employed for different kinds of problems by constructing a proper network structure \(C_{\theta }\) and cost function \(\phi \) for a given training set \(\{(p_s, q_s)\}_{s=1}^S\), where \(p_s\) is the observed input and \(q_s\) is the desired output. This process can be formulated as

$$\begin{aligned} C_{learn} = \min _{C_\theta ,\,\theta \in \varTheta }\left( \sum _{s=1}^S \phi (q_s, C_{\theta }(p_s))\right) \end{aligned}$$
(1)

where \(\varTheta \) is the set of all possible parameters. Once the learning step is complete, \(C_{learn}\) can be applied to the specific problem. In this work, we employ a CNN to estimate the area of CNV in retinal OCT images directly. Since the area of CNV is a continuous variable, using a CNN to estimate the area directly is a regression problem, and the least-squares error is used as the loss function. Thus, the problem is formulated as

$$\begin{aligned} R_{learn} = \min _{R_\theta ,\theta \in \varTheta }(\sum _{s=1}^S||y_s-R_\theta (x_s)||^2) \end{aligned}$$
(2)

where \(x_s\) is an OCT image and \(y_s\) is the corresponding area value.

Particularly, the structure of a typical CNN includes several convolutional layers and pooling layers, optionally followed by at least one fully connected layer. The structure of the CNN used in this work is shown in Table 1. The convolution kernels in all convolutional layers have rectangular size \(7\times 3\). The number of filters in the \(l^{th}\) convolutional layer is \(32\times 2^{l-1}\), where \(l=1,2,3,4\). The receptive field of each pooling layer is \(3\times 3\) with stride 2 for downsampling. At the final layer, a \(1\times 1\) convolution is used to map each 256-component feature vector to the output. In total, there are four convolutional layers, three pooling layers and one fully connected layer in the CNN. Note that each convolutional layer is followed by a rectified linear unit (ReLU) layer, which is not displayed in Table 1.

Table 1. The structure of the CNN used in this work.
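For concreteness, a minimal PyTorch sketch of such a network is given below. Since the full contents of Table 1 are not reproduced in this text, the placement of the pooling layers, the padding scheme and the use of global average pooling before the final layer are assumptions; only the quantities stated above (the \(7\times 3\) kernels, the \(32\times 2^{l-1}\) filters, the \(3\times 3\) stride-2 pooling and the final \(1\times 1\) convolution) come from the paper, and all names are illustrative.

```python
# Minimal sketch of the CNN described in Sect. 2.1 (names are illustrative;
# layer ordering, padding and global average pooling are assumptions).
import torch
import torch.nn as nn

class CNVAreaCNN(nn.Module):
    def __init__(self):
        super().__init__()
        layers = []
        in_ch = 1                        # a grayscale OCT B-scan
        for l in range(1, 5):            # four convolutional layers
            out_ch = 32 * 2 ** (l - 1)   # 32, 64, 128, 256 filters
            layers += [nn.Conv2d(in_ch, out_ch, kernel_size=(7, 3), padding=(3, 1)),
                       nn.ReLU(inplace=True)]
            if l <= 3:                   # three 3x3, stride-2 pooling layers
                layers += [nn.MaxPool2d(kernel_size=3, stride=2, padding=1)]
            in_ch = out_ch
        self.features = nn.Sequential(*layers)
        # Collapse the spatial dimensions so each image yields one
        # 256-component feature vector (assumption: global average pooling).
        self.gap = nn.AdaptiveAvgPool2d(1)
        # "fc1": a 1x1 convolution mapping the 256-vector to the scalar area.
        self.fc1 = nn.Conv2d(256, 1, kernel_size=1)

    def forward(self, x):
        x = self.features(x)             # (N, 256, H', W')
        x = self.gap(x)                  # (N, 256, 1, 1)
        return self.fc1(x).view(-1)      # (N,) predicted normalized area

if __name__ == "__main__":
    net = CNVAreaCNN()
    dummy = torch.randn(2, 1, 512, 256)  # two cropped B-scans (see Sect. 3.1)
    print(net(dummy).shape)              # torch.Size([2])
```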

2.2 Genetic Programming

Genetic programming (GP) is a biologically inspired machine learning method that evolves computer programs to perform a task. It does this by randomly generating a population of computer programs and then breeding the best-performing individuals together to create a new population. Usually, an individual in the population is represented by a tree structure built from a set of functions, e.g. minus and square, as shown in Fig. 1(a). One popular application of GP is symbolic data mining, which is the process of building an empirical mathematical model of data.

The general process of building an empirical mathematical model of data using GP is shown in Fig. 1(b). A population is first initialized, and the fitness of each individual is then evaluated. If the evolution has completed, the individual with the highest fitness value in the population is selected. Otherwise, genetic operations are employed to breed new individuals for the next generation. Crossover, mutation and duplication are the commonly used genetic operations. Crossover exchanges the subtrees at random positions in a pair of selected individuals, as shown in Fig. 1(c). Mutation replaces a subtree in an individual with a randomly generated tree, as shown in the upper part of Fig. 1(d). Duplication copies an individual unchanged into the next generation, as shown in the lower part of Fig. 1(d); a minimal sketch of this loop is given after Fig. 1. We refer the reader to [9] for a comprehensive treatment of GP.

Fig. 1. (a) Example of a tree (gene) representing the model term \((\cos (x_2)+\log (x_3))-\sin (x_1)\). (b) The general process of GP. (c) Crossover. (d) Mutation and duplication.
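The following is a minimal, self-contained Python sketch of the GP loop of Fig. 1(b): expression trees built from a small function set, fitness measured by squared error on toy data, subtree crossover, subtree mutation and duplication of the best individuals. It illustrates the general procedure only and is not the GPTIPS2 implementation used later in this paper.

```python
# Toy genetic-programming loop for symbolic regression (illustration only).
import math
import random

FUNCS = {"add": (2, lambda a, b: a + b),
         "sub": (2, lambda a, b: a - b),
         "mul": (2, lambda a, b: a * b),
         "sin": (1, math.sin)}
N_VARS = 2

def random_tree(depth=3):
    """Grow a random expression tree: internal nodes are functions, leaves are variables."""
    if depth == 0 or random.random() < 0.3:
        return ("x", random.randrange(N_VARS))
    name = random.choice(list(FUNCS))
    arity = FUNCS[name][0]
    return (name,) + tuple(random_tree(depth - 1) for _ in range(arity))

def evaluate(tree, x):
    if tree[0] == "x":
        return x[tree[1]]
    return FUNCS[tree[0]][1](*(evaluate(child, x) for child in tree[1:]))

def nodes(tree, path=()):
    """Enumerate (path, subtree) pairs; used to pick crossover/mutation points."""
    yield path, tree
    if tree[0] != "x":
        for i, child in enumerate(tree[1:], start=1):
            yield from nodes(child, path + (i,))

def replace(tree, path, new):
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(a, b):
    """Swap randomly chosen subtrees of two parents (Fig. 1(c))."""
    pa, sa = random.choice(list(nodes(a)))
    pb, sb = random.choice(list(nodes(b)))
    return replace(a, pa, sb), replace(b, pb, sa)

def mutate(tree):
    """Replace a randomly chosen subtree by a new random tree (Fig. 1(d))."""
    p, _ = random.choice(list(nodes(tree)))
    return replace(tree, p, random_tree(depth=2))

def fitness(tree, data):
    return -sum((evaluate(tree, x) - y) ** 2 for x, y in data)

# Toy regression target: y = x0 * x1 + sin(x0).
data = [((0.1 * i, 0.2 * j), 0.1 * i * 0.2 * j + math.sin(0.1 * i))
        for i in range(10) for j in range(5)]
pop = [random_tree() for _ in range(200)]
for _ in range(30):
    pop.sort(key=lambda t: fitness(t, data), reverse=True)
    nxt = pop[:20]                                   # duplication of the best individuals
    while len(nxt) < len(pop):
        a, b = random.sample(pop[:50], 2)
        for child in crossover(a, b):
            if random.random() < 0.3:
                child = mutate(child)
            if sum(1 for _ in nodes(child)) <= 50:   # simple bloat control
                nxt.append(child)
    pop = nxt[:200]
best = max(pop, key=lambda t: fitness(t, data))
print("best fitness:", fitness(best, data), "best tree:", best)
```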

3 Results

3.1 Data Collection and Augmentation

The OCT images used for the experiments were acquired from 7 patients. These images, with \(1024\times 512 \times 128\) voxels (11.72 \(\upmu \)m \(\times \) 5.86 \(\upmu \)m \(\times \) 15.6 \(\upmu \)m) covering a volume of 6 mm \(\times \) 6 mm \(\times \) 2 mm, were obtained with a ZEISS scanner. These 3D OCT volumes contain 4060 B-scan images with CNV, of which 60% are randomly selected as the training set, 20% as the validation set and 20% as the test set. All CNV regions in the B-scan images are delineated by a clinician. The ground truth used to train the CNN is calculated from the delineated results and normalized to the range [0, 1] by the following equation:

$$\begin{aligned} \widehat{y} = \frac{y-y_{min}}{y_{max}-y_{min}} \end{aligned}$$
(3)

where \(y\) and \(\widehat{y}\) are the area values before and after normalization, respectively, and \(y_{max}\) and \(y_{min}\) are the maximal and minimal area values in the training set.

Each image is cropped to a size of \(512\times 256\) to save memory. In addition, the cropping allows easy data augmentation: a point is randomly selected from the CNV region, and a sub-image of size \(512\times 256\) centered on this point is then cropped. In this work, this cropping process is repeated ten times for each original image.
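A minimal numpy sketch of this augmentation and of the normalization of Eq. (3) is given below. Function and variable names are illustrative, and the use of the CNV pixel count as the area value is an assumption made only for the demonstration.

```python
# Sketch of the CNV-centered random cropping and the min-max normalization.
import numpy as np

CROP_H, CROP_W = 512, 256

def random_cnv_crop(image, cnv_mask, rng):
    """Crop a CROP_H x CROP_W window centered on a randomly chosen CNV pixel."""
    ys, xs = np.nonzero(cnv_mask)                    # pixels delineated as CNV
    k = rng.integers(len(ys))
    cy, cx = ys[k], xs[k]
    # Clamp the window so it stays inside the original image.
    top = np.clip(cy - CROP_H // 2, 0, image.shape[0] - CROP_H)
    left = np.clip(cx - CROP_W // 2, 0, image.shape[1] - CROP_W)
    crop = image[top:top + CROP_H, left:left + CROP_W]
    area = cnv_mask[top:top + CROP_H, left:left + CROP_W].sum()  # CNV pixel count
    return crop, area

def normalize_areas(areas, y_min, y_max):
    """Eq. (3): min-max normalization using the training-set extrema."""
    return (np.asarray(areas, dtype=float) - y_min) / (y_max - y_min)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((1024, 512))                  # a dummy B-scan
    mask = np.zeros((1024, 512), dtype=bool)
    mask[400:450, 200:300] = True                    # a dummy CNV delineation
    crops = [random_cnv_crop(image, mask, rng) for _ in range(10)]
    areas = [a for _, a in crops]
    # Stand-in extrema for the demo (in practice: min/max over the training set).
    print(normalize_areas(areas, 0.0, float(mask.sum())))
```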

3.2 Training Details

The experiments are performed in MATLAB on a PC equipped with an Intel(R) Core(TM) i7-4790M CPU at 3.60 GHz and 8 GB of RAM. Stochastic Gradient Descent (SGD) with momentum 0.9 is employed to train the CNN, with the learning rate set to 0.002. Once training is done, the CNN takes about 0.0494 s to make a prediction.
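A minimal training-loop sketch with these settings (SGD with momentum 0.9, learning rate 0.002, least-squares loss of Eq. (2)) is shown below. It assumes the model sketch from Sect. 2.1; the batch size and epoch count are not stated in the paper and are placeholders.

```python
# Training-loop sketch: SGD (momentum 0.9, lr 0.002) with least-squares loss.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset

def train(model: nn.Module, dataset: Dataset, epochs: int = 50, batch_size: int = 16):
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.002, momentum=0.9)
    loss_fn = nn.MSELoss()                       # least-squares loss of Eq. (2)
    for _ in range(epochs):
        for images, areas in loader:             # normalized areas from Eq. (3)
            optimizer.zero_grad()
            loss = loss_fn(model(images), areas)
            loss.backward()
            optimizer.step()
    return model
```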

3.3 Performance of the CNN on Area Estimation

Several metrics are employed to evaluate the performance of the CNN on area estimation: mean square error (MSE), mean absolute error (MAE) and the Pearson correlation coefficient (Pcc) [8, 13]. MSE and MAE measure how close the predictions are to the actual outcomes; a smaller MSE (MAE) indicates better performance. Pcc measures the linear correlation between two variables (here, the predicted results and the ground truth) and takes values between +1 and −1, where +1 is total positive linear correlation, 0 is no linear correlation, and −1 is total negative linear correlation. A higher correlation coefficient indicates better performance.
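For completeness, with \(y_s\) the ground-truth area, \(\widehat{y}_s\) the predicted area and \(S\) the number of test samples, these standard metrics are defined as

$$\begin{aligned} \mathrm {MSE} = \frac{1}{S}\sum _{s=1}^S (y_s-\widehat{y}_s)^2, \quad \mathrm {MAE} = \frac{1}{S}\sum _{s=1}^S |y_s-\widehat{y}_s|, \quad \mathrm {Pcc} = \frac{\sum _{s=1}^S (y_s-\bar{y})(\widehat{y}_s-\bar{\widehat{y}})}{\sqrt{\sum _{s=1}^S (y_s-\bar{y})^2}\,\sqrt{\sum _{s=1}^S (\widehat{y}_s-\bar{\widehat{y}})^2}} \end{aligned}$$

where \(\bar{y}\) and \(\bar{\widehat{y}}\) are the means of the ground-truth and predicted areas over the test set.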

Table 2 summarizes the average results obtained by the trained CNN when estimating the area of CNV directly. It is observed that the values of MSE, MAE and Pcc obtained by the direct method are 0.0022, 0.0328 and 0.9763, respectively. To further evaluate the performance of the direct method, we also compare it with a segmentation-based method. In this work, we select u-net [10], a popular network structure in the biomedical image segmentation domain, for comparison. Figure 2 shows estimation results obtained by the two methods on several images. The red curves in each sub-figure of Fig. 2 are the CNV boundaries delineated by hand and the white curves are the boundaries obtained by u-net. Table 2 also summarizes the average results obtained by the segmentation-based method. It is observed that the average results obtained by the direct method are better than those of the segmentation-based method.

Fig. 2. Estimation results obtained by the different methods on different images. Note that the red curves are the ground truth and the white curves are the results obtained by u-net. (a) gt = 0.2633, dir = 0.2653, seg = 0.2504; (b) gt = 0.0365, dir = 0.0337, seg = 0.0415; (c) gt = 0.1603, dir = 0.2503, seg = 0.2281; (d) gt = 0.0068, dir = 0.0453, seg = 0.0194. In each subfigure title, gt is the ground truth, dir is the value obtained by the direct method and seg is the value obtained by the segmentation-based method. (Color figure online)

Table 2. The results obtained by the direct method and the segmentation-based method, respectively.
Table 3. The mathematical models obtained by GP.
Table 4. The performance of each mathematical model.

3.4 Deriving the Function Approximated by CNN

As shown by the results above, the performance of the CNN used to estimate the area of CNV directly is very competitive. However, the entire process behaves like a black box, and it is difficult to derive the function being approximated by the CNN by studying its structure. To explain how the CNN works, we try to find the function it approximates. To this end, we employ GP, which can automatically evolve an empirical mathematical model from data, to derive the mathematical model of the last layer of the CNN, namely to find a mathematical model that fits the input and output of the fc1 layer. The GPTIPS2 toolbox [11] with default settings is employed to build the mathematical models automatically. Table 3 lists several models obtained by GP. Note that \(x_i\) is the \(i^{th}\) variable in the input of the fc1 layer, where \(i=1, 2, \ldots , 256\), since the input of the fc1 layer is a \(256\times 1\) vector.
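The fitting step can be illustrated with a short sketch. The paper uses the MATLAB toolbox GPTIPS2 [11]; the sketch below substitutes the open-source Python package gplearn purely as a stand-in (an assumption, not the authors' tool chain) and assumes the CNN sketch from Sect. 2.1: the 256-dimensional inputs of the fc1 layer and the scalar CNN outputs are collected with a forward hook and then passed to a symbolic regressor. All names are illustrative.

```python
# Sketch of fitting a GP surrogate to the fc1 layer (gplearn used as a stand-in).
import torch
from gplearn.genetic import SymbolicRegressor   # assumed available

def collect_fc1_data(model, images):
    """Record (fc1 input, CNN output) pairs via a forward hook on model.fc1."""
    feats = []
    handle = model.fc1.register_forward_hook(
        lambda module, inputs, output: feats.append(inputs[0].flatten(1).detach()))
    with torch.no_grad():
        preds = model(images)
    handle.remove()
    X = torch.cat(feats).numpy()                 # shape (N, 256): fc1 inputs
    y = preds.numpy()                            # shape (N,): CNN outputs
    return X, y

def fit_surrogate(X, y):
    """Run symbolic regression on the (fc1 input, output) pairs."""
    est = SymbolicRegressor(population_size=1000, generations=20,
                            function_set=('add', 'sub', 'mul', 'div', 'sin', 'cos'),
                            random_state=0)
    est.fit(X, y)
    return est   # str(est._program) gives the evolved expression in x0..x255
```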

Table 4 summarizes the performance of each model in fitting the data of the fc1 layer. It is observed that the derived mathematical models only contain n (\(n<256\)) variables, which means that the function of the fc1 layer can be approximated by an \(n\)-dimensional curve. It is also observed that the derived mathematical model can differ between GP runs, and so can its fitting performance. The best model in the list is model 5, which achieves a Pcc of 0.9426. Although the results are not outstanding, they illustrate that using GP to derive, for each layer in the CNN, a mathematical model with the same input and output as the corresponding layer could be feasible, since a different (and possibly better-fitting) model can be obtained from each GP run.

4 Discussion and Conclusion

In this paper, we employ CNNs to estimate the area of CNV in retinal OCT images directly. The results show that the performance of a CNN trained on raw images is very competitive. To explain how the trained CNN works, we try to find the function it approximates. To this end, we propose using GP to derive a surrogate model for each layer in the CNN that represents the function being approximated by the CNN. In this work, we derive such a model for the fc1 layer. The experimental results show that the function of the fc1 layer can be approximated by a high-dimensional curve. However, so far, the GP used here is only suitable for the case where the input consists of multiple variables and the output is a single variable. If we searched for such a surrogate model for every single output variable in the CNN, the amount of computation would be huge. Could we derive a mathematical model that fits a layer whose input and output are both multi-variable? This is the issue we will explore in future work.