## Abstract

Rupture risk assessment of abdominal aortic aneurysms (AAAs) by means of quantifying wall stress is a common biomechanical strategy. However, the clinical translation of this approach has been greatly limited due to the complexity associated with the computational tools required for its implementation. Thus, being able to estimate wall stress using nonbiomechanical markers that can be quantified as a direct outcome of clinical image segmentation would be advantageous in improving the potential implementation of said strategy. In the present work, we investigated the use of geometric indices to predict patient-specific AAA wall stress by means of a novel neural network (NN) modeling approach. We conducted a retrospective review of existing clinical images of two patient groups: 98 asymptomatic and 50 symptomatic AAAs. The images were subjected to a protocol consisting of image segmentation, processing, volume meshing, finite element modeling, and geometry quantification, from which 53 geometric indices and the spatially averaged wall stress (SAWS) were calculated. SAWS estimated from finite element analysis was considered the gold standard for the predictions. We developed feed-forward NN models composed of an input layer, two dense layers, and an output layer using Keras, a deep learning library in Python. The NN models were trained, tested, and validated independently for both AAA groups using all geometric indices, as well as a reduced set of indices resulting from a variable reduction procedure. We compared the performance of the NN models with two standard machine learning algorithms (MARS: multivariate adaptive regression splines and GAM: generalized additive model) and a linear regression model (GLM: generalized linear model). With the reduced sets of indices, the NN-based approach exhibited the highest mean goodness-of-fit (for the symptomatic group 0.71 and for the asymptomatic group 0.79) and lowest mean relative error (17% for both groups). 
In contrast, MARS yielded a mean goodness-of-fit of 0.59 for the symptomatic group and 0.77 for the asymptomatic group, with relative errors of 17% for the symptomatic group and 22% for the asymptomatic group. GAM had a mean goodness-of-fit of 0.70 for the symptomatic group and 0.80 for the asymptomatic group, with relative errors of 16% for the symptomatic group and 20% for the asymptomatic group. GLM did not perform as well as the other algorithms, with a mean goodness-of-fit of 0.53 for the symptomatic group and 0.70 for the asymptomatic group, with relative errors of 19% for the symptomatic group and 23% for the asymptomatic group. Nevertheless, the NN models required a reduced set of 15 and 13 geometric indices to predict SAWS for the symptomatic and asymptomatic AAA groups, respectively. This was in contrast to the reduced set of nine and eight geometric indices required to predict SAWS with the MARS and GAM algorithms for each AAA group, respectively. The use of NN modeling represents a promising alternative methodology for the estimation of AAA wall stress using geometric indices as surrogates, in lieu of finite element modeling. The performance metrics of NN models are expected to improve with significantly larger group sizes, given the suitability of NN modeling for “big data” applications.

## 1 Introduction

Abdominal aortic aneurysm (AAA) is a life-threatening condition where a high number of patients with a ruptured AAA die before clinical intervention is possible [1]. In current clinical practice, maximum AAA diameter (*D*_{max}) and aneurysm growth rate are the most widely utilized clinical markers for assessing progression of the disease [2]. Ruptured AAAs are the 13th leading cause of death in the United States with an associated mortality rate of 90% while causing nearly 11,000 deaths per year [3]. Gender is also a well-established risk factor for the development of an AAA, with a 4:1 male to female ratio [4]. The gold standard for recommending elective repair for AAA patients is a maximum diameter of 5.5 cm in men and 5.0 cm in women [3]. However, progressive dilation weakens the aortic wall, decreasing its ability to withstand blood pressure, and ultimately leading to AAA rupture. The latter is a mechanical event that occurs when the local wall stress exceeds the local wall strength. Based on this principle, several biomechanical markers such as peak wall stress (*PWS*), 99th percentile wall stress (*99^{th}WS*), spatially averaged wall stress (*SAWS*), and rupture potential index have been postulated as superior predictors of AAA rupture compared to *D*_{max} alone [5–16]. As patient-specific wall stress calculations require prior knowledge of individual AAA material properties, geometric surrogates such as wall thickness, surface curvature, size, and shape measures [14,16–21] have been postulated as image-based predictors of wall stress. Such geometric indices could be used as part of a computational tool for AAA rupture risk assessment in lieu of complex finite element analysis (FEA) [14,22].

The overall objective of the present work is to investigate the potential of geometric indices to predict patient-specific AAA wall stress by means of a novel neural network (NN)-based modeling approach. NNs are a set of algorithms designed to recognize patterns in large sets of data. Previously, NN-based prediction of aortic wall stress (using geometric indices and other similar variables) has been proposed in computational studies of idealized vascular geometries [22–25]. Our goals were (1) to develop an NN-based predictive model using geometric indices as predictors and SAWS as the response; (2) to investigate the prediction accuracy when using a reduced set of predictors; and (3) to compare the prediction accuracy of the NN with standard machine learning (ML) algorithms.

## 2 Materials and Methods

### 2.1 Human Subjects Data.

The source of data for this study consisted of two patient groups: 50 symptomatic AAA patients who underwent an emergent aneurysm repair and 98 asymptomatic AAA patients who underwent an elective aneurysm repair. For each patient, we retrospectively acquired the last abdominal computed tomography angiography (CTA) scan available prior to the emergent or elective repair. A flowchart illustrating the protocol followed in the study is shown in Fig. 1. The human subjects' research protocol was approved by the Institutional Review Boards at Allegheny General Hospital (Pittsburgh, PA) and Northwestern Memorial Hospital (Chicago, IL). Informed consent was not required as this was a retrospective review of existing, de-identified medical records.

### 2.2 Three-Dimensional Image Reconstruction.

CTA images (slice thickness 1.0–3.0 mm; 512 × 512 pixels) ranging from immediately distal to the left renal artery to approximately 5 cm distal to the iliac bifurcation were used in standard Digital Imaging and Communications in Medicine (DICOM) format. These DICOM files were utilized as inputs for the in-house MATLAB-based (MathWorks Inc., Natick, MA) segmentation script *AAAVasc* (v1.0.3, The University of Texas at San Antonio, San Antonio, TX) [26]. AAAVasc segments the aorta using three distinct boundaries: lumen, inner wall, and outer wall. Lumen segmentation was performed by means of a region-growing algorithm in which the user selects a seed point (inside the lumen) that acts as an initiation point. The algorithm then searches for neighboring pixels in eight directions from this seed pixel (top, bottom, left, right, and the four diagonals). The search is discontinued when a significant change in intensity is detected (measured using the image pixel intensity gradient), which signifies the end of the lumen region. This process was repeated for each image in the patient-specific set of CTA images. Next, the outer wall segmentation was performed using contour detection [27]. AAAVasc provides the user with possible outer wall contours and the user selects the best-fitting outer wall. To segment the inner wall, a simple feed-forward neural network (FFNN)^{1} was utilized. In this step, the user selects approximately five to ten seed points that could be part of the inner wall boundary and seed points that are not part of the wall region. The FFNN was subsequently trained and all the pixels between the lumen and the outer wall were labeled as either (a) part of the wall or (b) not part of the wall. Using the aforementioned labels, the FFNN then allocated the inner wall boundary for each image. 
With the three boundaries identified, the three regions (lumen, intraluminal thrombus (ILT), and wall) were subsequently exported as binary masks and as a point cloud for further processing. The ILT region was represented in each binary mask by the two-dimensional region between the inner wall and lumen boundaries.
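The lumen region-growing step described above can be illustrated with a minimal sketch. This is not the AAAVasc implementation: the function name `region_grow`, the `max_step` threshold (a simple stand-in for the intensity-gradient criterion), and the toy image are all assumptions made for illustration.

```python
from collections import deque


def region_grow(image, seed, max_step):
    """Grow a region from `seed` over 8-connected pixels.

    image: 2D list of pixel intensities; seed: (row, col) inside the lumen.
    max_step: hypothetical limit on the intensity jump between neighbouring
    pixels, standing in for the gradient-based stopping criterion.
    """
    rows, cols = len(image), len(image[0])
    # Eight search directions: top, bottom, left, right, and four diagonals.
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1),
               (-1, -1), (-1, 1), (1, -1), (1, 1)]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in offsets:
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if (nr, nc) in region:
                continue
            # Do not grow across a sharp intensity change (lumen boundary).
            if abs(image[nr][nc] - image[r][c]) <= max_step:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Toy 5 x 5 "slice": bright lumen (200) surrounded by darker tissue (50).
toy_slice = [
    [50, 50, 50, 50, 50],
    [50, 200, 200, 200, 50],
    [50, 200, 200, 200, 50],
    [50, 200, 200, 200, 50],
    [50, 50, 50, 50, 50],
]
lumen = region_grow(toy_slice, seed=(2, 2), max_step=30)
print(len(lumen))  # 9 lumen pixels
```

In practice the same call would be repeated for every CTA slice, as the text describes, with a seed chosen inside the lumen on each image.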

### 2.3 Computational Analysis of Patient-Specific AAA Wall Stress.

Using *AAAMesh* [28,29], a quadratic hexahedral mesh was created from the aforementioned binary masks for each AAA geometry. The volumetric mesh has a patient-specific wall thickness distribution estimated during image segmentation, similar to previously reported studies [5,14,16,30]. ILT was not meshed in the model, while the wall mesh density was fixed to yield a wall two elements thick. The resulting mesh consisted of approximately 60,000–100,000 elements, which were exported in NASTRAN file format and processed with the finite element solver ADINA (Adina R&D Inc., Watertown, MA) for subsequent wall stress analysis. We previously performed a mesh sensitivity analysis for quasi-static AAA FEA modeling [27]. The patient-specific FEA simulations were performed in 24 time steps with an intraluminal pressure of 120 mmHg and with all translational and rotational degrees-of-freedom fixed at the proximal and distal ends. A generalized neo-Hookean material model was used for the AAA wall, with its strain energy function given by Eq. (1)

$$W = c_1 (I_1 - 3) + c_3 (I_1 - 3)^2 \tag{1}$$

where $W$ is the strain energy density, $I_1$ is the first invariant of the Cauchy–Green tensor, $c_1$ and $c_3$ are constitutive parameters evaluated experimentally, and Poisson's ratio was set to represent a nearly incompressible material ($\nu = 0.49$) [31,32]. The parameters ($c_1 = 17.4$ N/cm^{2} and $c_3 = 188.1$ N/cm^{2}) were adopted from the study by Raghavan and Vorp [31], which involved uniaxial tensile testing of AAA wall tissue specimens from 69 patients. From the FEA simulations, first principal stresses were obtained and PWS, 99^{th}WS, and SAWS were estimated using EnSight (Ansys Inc., Canonsburg, PA). Previously, SAWS was reported to have stronger correlations with aneurysmal geometric indices, compared to PWS and 99^{th}WS [14,16]. These global stress measures are defined in Appendix B of the Supplementary Material on the ASME Digital Collection.
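As a rough numerical illustration of this material model, the sketch below evaluates a two-parameter strain energy of the form $W = c_1 (I_1 - 3) + c_3 (I_1 - 3)^2$ and the corresponding uniaxial Cauchy stress. It assumes exact incompressibility (the FEA used $\nu = 0.49$) and a homogeneous uniaxial stretch; the function names are hypothetical, and the code is not part of the ADINA workflow.

```python
def strain_energy(i1, c1=17.4, c3=188.1):
    """Strain energy density W (N/cm^2) for a given first invariant I1."""
    return c1 * (i1 - 3.0) + c3 * (i1 - 3.0) ** 2


def uniaxial_cauchy_stress(stretch, c1=17.4, c3=188.1):
    """Uniaxial Cauchy stress (N/cm^2), assuming an incompressible material."""
    # First invariant for uniaxial stretch with lateral stretches 1/sqrt(stretch).
    i1 = stretch ** 2 + 2.0 / stretch
    # Derivative of W with respect to I1.
    dw_di1 = c1 + 2.0 * c3 * (i1 - 3.0)
    return 2.0 * (stretch ** 2 - 1.0 / stretch) * dw_di1


for stretch in (1.0, 1.05, 1.10, 1.15):
    print(stretch, round(uniaxial_cauchy_stress(stretch), 2))
```

The quadratic term makes the response stiffen with stretch, which is the qualitative behavior this class of model is meant to capture for aneurysmal wall tissue.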

### 2.4 Quantification of Geometric Indices.

Geometric indices were quantified for each AAA, including one-dimensional geometric indices (e.g., average diameter (*D*_{ave}) and maximum diameter (*D*_{max})) and two-dimensional geometric indices (e.g., wall surface area (*S*)). The volumetric meshes were utilized to evaluate three-dimensional geometric indices such as AAA sac volume (*V*). The biquintic Hermite finite element method, implemented from Lee et al. [35], was used to evaluate global curvature-based indices (such as area-averaged mean curvature (MAA)). Local curvature distributions were utilized to estimate the global curvature indices using a high-order interpolation scheme. Local principal curvatures $k_1$ and $k_2$ are estimated using Eqs. (2) and (3)

where $a$, $b$, and $c$ are constants evaluated at every outer wall surface node. The detailed formulations and abbreviations of all geometric indices are provided in Appendix A of the Supplementary Material on the ASME Digital Collection.
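For orientation, a common way to obtain principal curvatures from three nodal constants is to approximate the surface around each node by a quadric patch in a local tangent frame. The expressions below assume that $a$, $b$, and $c$ are the coefficients of such a patch; this is an assumption about the role of the constants, not a statement of the exact formulation used with the biquintic Hermite interpolation.

```latex
% Local quadric patch in the tangent frame of a surface node (assumed form)
z = a x^{2} + b x y + c y^{2}

% Its Hessian at the node is [[2a, b], [b, 2c]]; the eigenvalues are the
% principal curvatures
k_{1} = (a + c) + \sqrt{(a - c)^{2} + b^{2}}, \qquad
k_{2} = (a + c) - \sqrt{(a - c)^{2} + b^{2}}

% Mean and Gaussian curvature follow directly
\frac{k_{1} + k_{2}}{2} = a + c, \qquad k_{1} k_{2} = 4ac - b^{2}
```

Global indices such as an area-averaged mean curvature would then be obtained by integrating the nodal values over the outer wall surface.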

### 2.5 Neural Network Architecture and Machine Learning Algorithms.

A feed-forward NN was developed consisting of an input layer, two dense layers, and an output layer (as shown in Fig. 2) using *Keras*, a deep learning library in Python (Python Software Foundation, Wilmington, DE). These layers were connected using appropriate activation functions where the resultant output layer was the weighted sum of the input nodes. Four NN models were developed for both the symptomatic and asymptomatic AAA patient groups using several combinations of layers and activation functions. The models were then evaluated for various epochs and the best prediction model for both groups was reported. Mean square error (MSE) and mean absolute error (MAE) were used to estimate the accuracy of the models. Further, we also compared these NN models with ML algorithms (specifically, generalized additive model (GAM) [36] and multivariate adaptive regression splines (MARS) [37]) applied to the same groups, following a ML protocol reported previously in Ref. [16].
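To make the layer structure concrete, the following sketch computes the forward pass of a network with an input layer, two dense layers, and a single linear output node, written in plain Python rather than Keras so that it runs without extra dependencies. The hidden-layer sizes (32 and 16), the ReLU activation, and the random initialization are illustrative assumptions; the paper does not report these settings here.

```python
import random


def relu(z):
    """Rectified linear activation (an assumed choice)."""
    return max(0.0, z)


def dense(inputs, weights, biases, activation=None):
    """One dense layer: weighted sum of the inputs plus bias, then activation."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(n_inputs, hidden_sizes=(32, 16), seed=0):
    """Input layer -> two dense layers -> one output node."""
    rng = random.Random(seed)
    sizes = [n_inputs, *hidden_sizes, 1]
    return [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def predict(network, indices):
    """Propagate one patient's geometric indices through the network."""
    x = list(indices)
    for k, (weights, biases) in enumerate(network):
        is_output = k == len(network) - 1
        # The output node is a plain weighted sum (no activation).
        x = dense(x, weights, biases, activation=None if is_output else relu)
    return x[0]


net = build_network(n_inputs=15)       # e.g., 15 reduced indices
saws_hat = predict(net, [0.5] * 15)    # untrained, so the value is arbitrary
print(saws_hat)
```

In Keras the same topology would typically be expressed as a `Sequential` model of `Dense` layers compiled with an MSE loss and an MAE metric; the weights here are untrained, so only the structure is meaningful.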

### 2.6 Training, Testing, and Validation Datasets.

The data were randomly divided into training, testing, and validation datasets (see schematic of study design in Fig. 3). For the symptomatic group, the NN model was trained and tested on 40 AAAs with a 70:30 split. This trained symptomatic NN model was then validated using 10 AAAs that were not part of the training or testing datasets. Similarly, for the asymptomatic group, the NN was trained and tested on 80 AAAs with a 70:30 split, and the trained model was further validated using 18 AAAs (that were not part of the training or testing datasets). The aforementioned datasets were also utilized to train and validate the ML algorithms (GAM and MARS).
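The partitioning above can be sketched as follows. The helper `split_group` and the fixed random seed are hypothetical; the subset sizes follow from the stated 70:30 ratio (28/12 of 40 symptomatic AAAs and 56/24 of 80 asymptomatic AAAs), assuming simple rounding.

```python
import random


def split_group(patient_ids, n_validation, train_fraction=0.7, seed=0):
    """Randomly hold out a validation set, then split the rest 70:30."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    validation = ids[:n_validation]          # never seen during training/testing
    remaining = ids[n_validation:]
    n_train = round(train_fraction * len(remaining))
    return remaining[:n_train], remaining[n_train:], validation


sym_train, sym_test, sym_val = split_group(range(50), n_validation=10)
asym_train, asym_test, asym_val = split_group(range(98), n_validation=18)

print(len(sym_train), len(sym_test), len(sym_val))      # 28 12 10
print(len(asym_train), len(asym_test), len(asym_val))   # 56 24 18
```

Holding the validation cases out before the 70:30 split keeps them independent of both training and testing, which is the property the study design relies on.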

### 2.7 Variable Reduction Procedure.

The first variable set consisted of all 53 geometric indices as *predictors* and SAWS as the *response*. The second variable set consisted of a reduced set of variables (selected using either Pearson's correlation analysis or MARS) as *predictors* and SAWS as the response. The variable reduction was performed using Pearson's correlation in conjunction with Bonferroni correction (to reduce the probability of a type I error while performing multiple hypothesis tests) [14,16,38]. To identify the geometric indices that have the best correlation with SAWS, a series of hypothesis tests was carried out. Specifically, for each $j = 1, \ldots, m$ (where $m = 53$), the null hypothesis $H_0^{(j)}: \rho_j = 0$ was tested against the alternative hypothesis $H_a^{(j)}: \rho_j \neq 0$, where $\rho_j$ is the population correlation coefficient between the *j*th geometric index and SAWS. For each $j$, $H_0^{(j)}$ is rejected for large values of $|T_j|$ according to Eq. (4)

$$T_j = \frac{r_j \sqrt{n-2}}{\sqrt{1-r_j^2}}, \qquad r_j = \frac{\sum_{i=1}^{n} (x_{ij}-\bar{x}_j)(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n} (x_{ij}-\bar{x}_j)^2 \sum_{i=1}^{n} (y_i-\bar{y})^2}} \tag{4}$$

where $r_j$ is the sample (Pearson's) correlation coefficient between the *j*th geometric index and SAWS, $x_{ij}$ is the value of the *j*th geometric index from the *i*th patient, $\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij}$, $y_i$ is SAWS for the *i*th patient, and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$. The *p*-values of these tests are computed using the fact that $T_j$ has a *t*-distribution with $n-2$ degrees-of-freedom when $H_0^{(j)}$ is true [14,16]. Since there are 53 geometric indices, the adjusted level of significance, $\alpha_c$, was set to $0.05/53 = 0.00094$ (i.e., the Bonferroni correction). Geometric indices that exhibited no significant correlations with SAWS were removed from the set of reduced geometric indices. Further, collinear indices, i.e., indices with a $\rho$ value > 0.98 among each other, were also removed. After removal of the nonsignificant variables, the reduced subset of geometric indices was utilized as input *neurons* for the NN models.
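The screening statistic can be reproduced in a few lines. The sketch below computes the sample Pearson correlation, the associated *t* statistic with $n-2$ degrees-of-freedom, and the Bonferroni-adjusted significance level; the data values are made up for illustration, and the study itself performed this analysis in R.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """Test statistic for H0: rho = 0; t-distributed with n - 2 dof under H0."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)


m = 53                  # number of geometric indices tested
alpha_c = 0.05 / m      # Bonferroni-adjusted significance level

# Made-up values of one geometric index and SAWS for six patients.
index_values = [4.1, 5.0, 5.6, 6.2, 7.0, 7.4]
saws_values = [18.0, 21.5, 22.0, 27.0, 29.5, 33.0]

r = pearson_r(index_values, saws_values)
t = t_statistic(r, len(index_values))
print(round(alpha_c, 5), round(r, 3), round(t, 2))
# A two-sided p-value would come from the t-distribution with n - 2 dof
# (e.g., 2 * scipy.stats.t.sf(abs(t), n - 2)) and be compared with alpha_c.
```

An index is retained only if its *p*-value falls below `alpha_c`, after which near-collinear indices are pruned as described in the text.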

For the ML algorithms, MARS was utilized to refine the best set of geometric indices for predicting SAWS in both MARS and GAM. We also compared the prediction accuracy of MARS and GAM with all 53 indices as inputs. The correlation analysis and variable selection procedures were performed with the R [39] packages *corrgram* [40], *earth* [37], and *gam* [41], respectively. Additional information on how MARS performs the variable refinement protocol for the ML algorithms is provided in Appendix C of the Supplementary Material on the ASME Digital Collection.

### 2.8 Neural Network Optimization.

In the MSE and MAE computed for the NN models, $n$ is the sample size, $z_k$ is the actual value (the FEA-estimated SAWS), and $\hat{z}_k$ is the predicted value (the NN-estimated SAWS).
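With the symbols defined above, the standard definitions of the two error measures (assumed here to be the ones intended, since both are named in Sec. 2.5) read:

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{k=1}^{n} \left( z_k - \hat{z}_k \right)^{2},
\qquad
\mathrm{MAE} = \frac{1}{n} \sum_{k=1}^{n} \left| z_k - \hat{z}_k \right|
```

MSE penalizes large deviations more heavily because of the squared term, whereas MAE weights all deviations linearly, which is why the two are usually reported together.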

When training the NN models, “early stopping” was used to identify the optimum number of epochs for each model. Early stopping monitors a specified performance measure during training and halts the training process once a trigger condition on that measure is met. MAE was the measure monitored to trigger the stop. When the validation loss starts to increase over the next few epochs after the lowest MAE value has been achieved, early stopping halts training of the NN model, thereby limiting over-fitting.
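The stopping rule can be sketched independently of any library. In the example below, training stops once the monitored validation MAE has failed to improve on its best value for `patience` consecutive epochs; the `patience` value and the loss history are illustrative assumptions, since the settings used in the study are not stated here.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops when the loss has not improved on its best value for
    `patience` consecutive epochs; otherwise it runs to the last epoch.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Made-up validation MAE history: improves until epoch 4, then degrades.
val_mae = [0.50, 0.40, 0.31, 0.25, 0.22, 0.23, 0.24, 0.26, 0.27, 0.30]
stop, best = early_stopping_epoch(val_mae, patience=3)
print(stop, best)  # 7 4
```

Keras provides this behavior through its `EarlyStopping` callback, which takes the monitored quantity and a patience value; restoring the weights from the best epoch is a common companion option.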

### 2.9 Comparison of Prediction Accuracies.

The goodness-of-fit (*R*^{2}) for all the computational algorithms was obtained by performing a regression analysis of the predicted and FEA-estimated SAWS derived from the validation dataset. The relative error was calculated according to Eq. (8)

The aforementioned metrics were reported for the validation dataset, i.e., ten AAAs in the symptomatic group and 18 AAAs in the asymptomatic group, respectively. Both metrics were evaluated for all three algorithms (NN, GAM, and MARS) using all 53 indices and their respective reduced sets of indices.
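Both metrics are simple to compute once predicted and FEA-estimated SAWS are available for the validation cases. The sketch below assumes that *R*^{2} is that of a simple linear regression between predicted and FEA values (equal to the squared Pearson correlation) and that the relative error is the absolute difference normalized by the FEA value, averaged over patients and expressed in percent. Both are assumptions about the exact definitions, and the numbers are made up.

```python
def r_squared(actual, predicted):
    """R^2 of a simple linear regression between predicted and actual values."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    sap = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    saa = sum((a - mean_a) ** 2 for a in actual)
    spp = sum((p - mean_p) ** 2 for p in predicted)
    return sap ** 2 / (saa * spp)


def mean_relative_error(actual, predicted):
    """Mean of |actual - predicted| / actual over all cases, in percent."""
    errors = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Made-up SAWS values (N/cm^2) for four validation cases.
saws_fea = [20.0, 25.0, 40.0, 16.0]
saws_nn = [22.0, 24.0, 36.0, 18.0]

fit = r_squared(saws_fea, saws_nn)
rel_err = mean_relative_error(saws_fea, saws_nn)
print(round(fit, 3), round(rel_err, 3))
```

Note that a high *R*^{2} does not by itself imply a low relative error, because a regression-based *R*^{2} is insensitive to a systematic offset or scaling between predicted and FEA values; reporting both metrics covers that gap.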

### 2.10 Linear Regression.

Using the geometric indices selected from the Pearson's correlation with Bonferroni correction, a linear regression model was fit using the “stats” package in RStudio [42]. A generalized linear model (GLM) was fit for the asymptomatic group using the reduced geometric indices as predictors and SAWS as the response. Similarly, GLM was also used for the symptomatic group with the reduced geometric indices as predictors and SAWS as the response. For the asymptomatic group, GLM was fit using 80 AAAs and tested on the remaining 18 AAAs (that were not part of the fit); for the symptomatic group, GLM was fit using 40 AAAs and tested using the remaining 10 AAAs.

## 3 Results

### 3.1 Wall Stress.

Spatially averaged wall stress, estimated from patient-specific FEA, was higher for the symptomatic group compared to the asymptomatic group. The mean SAWS for the symptomatic and asymptomatic groups was 30.1±11.4 N/cm^{2} and 22.3±8.3 N/cm^{2}, respectively. The mean values and standard deviations of PWS, 99^{th}WS, and SAWS are included in Appendix D of the Supplementary Material on the ASME Digital Collection.

### 3.2 Geometric Indices.

The mean values and standard deviations of all geometric indices are included in Appendix D of the Supplementary Material. The NN models used these indices as predictors and SAWS as the response, with one model for the symptomatic group and another model for the asymptomatic group.

### 3.3 Neural Network Models

#### 3.3.1 Reduced Subset of Geometric Indices for the Neural Network.

For the NN models, Pearson's correlations were performed to reduce the number of redundant and nonsignificant geometric indices. Figures 4 and 5 show the correlation matrices representing all variables that correlate most with SAWS for the symptomatic AAA group (*n* = 50) and the asymptomatic AAA group (*n* = 98), respectively. The correlation coefficients between any two variables are represented by the size and color of the dot in the matrices. The reduced lists of indices from the correlation analysis for both AAA groups are summarized in Table 1. Of the 53 indices, 15 and 13 indices were significantly correlated (*p*-value < 0.00094) with SAWS for the symptomatic and asymptomatic groups, respectively. Of the aforementioned reduced sets of indices, *S*, *V*, *L*, *D*_{max}, *D*_{ave}, TH_{ave}, TH_{Dmax}, TH_{median}, and dc_{max} were common to both groups. For the symptomatic group, TH_{Dmax} exhibited the strongest negative association with SAWS ($\rho = -0.652$, *p*-value ≪ 0.05), whereas *V*_{ILT} exhibited the strongest positive association with SAWS ($\rho = 0.494$, *p*-value ≪ 0.05). Similarly, for the asymptomatic group, *S* and MAA exhibited the strongest positive and negative correlations with SAWS, respectively ($\rho = 0.665$, *p*-value ≪ 0.05 versus $\rho = -0.555$, *p*-value ≪ 0.05).

SAWS (symptomatic AAA) | SAWS (asymptomatic AAA) | |||
---|---|---|---|---|

Geometric indices | Pearson's correlation coefficient $(\rho )$ | p-Value | Pearson's correlation coefficient $(\rho )$ | p-Value |

D_{max} | 0.441 | ≪0.05 | 0.596 | ≪0.05 |

S | 0.472 | ≪0.05 | 0.665 | ≪0.05 |

V | 0.416 | ≪0.05 | 0.654 | ≪0.05 |

IPR | –0.332 | ≪0.05 | ||

D_{maxdir} | 0.427 | ≪0.05 | ||

D_{ave} | 0.433 | ≪0.05 | 0.598 | ≪0.05 |

H | 0.415 | ≪0.05 | ||

MAA | –0.555 | ≪0.05 | ||

L | 0.400 | ≪0.05 | 0.403 | ≪0.05 |

V_{ILT} | 0.494 | ≪0.05 | ||

$\gamma $ | 0.416 | ≪0.05 | ||

TT_{max} | 0.327 | ≪0.05 | ||

dc_{max} | 0.320 | ≪0.05 | 0.393 | ≪0.05 |

dc | 0.346 | ≪0.05 | ||

TH_{mode} | –0.560 | ≪0.05 | ||

TH_{Dmax} | –0.652 | ≪0.05 | –0.389 | ≪0.05 |

TH_{median} | –0.588 | ≪0.05 | –0.331 | ≪0.05 |

TH_{min} | –0.335 | ≪0.05 | ||

TH_{ave} | –0.593 | ≪0.05 | –0.353 | ≪0.05 |

Correlation coefficients with $|\rho| > 0.5$ are highlighted in bold-faced text.

#### 3.3.2 Training and Validation Loss for the NN Models.

Detailed training and validation loss characteristics for the four NN models are shown in Figs. 6(a)–6(d), which illustrate the convergence of the training and validation losses over the epochs for each model. Training loss is defined as the overall error on the training set, whereas validation loss is the overall error obtained when the validation dataset is passed through the trained network. Training and validation losses were estimated using MSE. Figure 6(c) shows the convergence of MSE for the training and validation datasets using the reduced set of geometric indices for the symptomatic group. The training MSE at epoch 0 was 0.9 and at 100 epochs was 0.18; similarly, the validation MSE at epoch 0 was 0.2, which converged at 100 epochs to 0.02. A similar trend can be seen in Fig. 6(d) for the asymptomatic group. Figures 6(a) and 6(b) show the training and validation losses for the symptomatic and asymptomatic groups, respectively, using all 53 geometric indices. These loss curves do not follow the trends seen in Figs. 6(c) and 6(d) because many of the 53 indices do not contribute toward the prediction of SAWS.

The training and testing MAE and MSE for the models are reported in Table 2. For the symptomatic AAA group, the training MAE and MSE for the model with 53 indices converged to 0.48 and 0.30, respectively, while the testing MAE and MSE for the model with 53 indices converged to 0.36 and 0.18, respectively. The training MAE and MSE for the model with 15 indices converged to 0.18 and 0.05, respectively, while the testing MAE and MSE for the model with 15 indices converged to 0.11 and 0.02, respectively. Since the MAE and MSE of the training and testing sets for the model using 53 indices were roughly three to nine times higher than those of the model using 15 indices (Table 2), we infer that a reduced set of 15 geometric indices highly correlated with SAWS is adequate to build the NN model for symptomatic AAA.

Dataset type | Number of geometric indices (SAWS, symptomatic AAA) | MAE (SAWS, symptomatic AAA) | MSE (SAWS, symptomatic AAA) | Number of geometric indices (SAWS, asymptomatic AAA) | MAE (SAWS, asymptomatic AAA) | MSE (SAWS, asymptomatic AAA) |
---|---|---|---|---|---|---|
Training | 53 | 0.48 | 0.30 | 53 | 0.14 | 0.03 |
Testing | 53 | 0.36 | 0.18 | 53 | 0.27 | 0.10 |
Training | 15 | 0.18 | 0.05 | 13 | 0.13 | 0.02 |
Testing | 15 | 0.11 | 0.02 | 13 | 0.17 | 0.04 |

The NN model metrics for the asymptomatic group are also reported in Table 2. For this group, the training MAE and MSE for the model with 53 indices were 0.14 and 0.03, respectively. The testing MAE and MSE were 0.27 and 0.10, respectively. Using a reduced set of 13 indices, the training MAE (0.13) and MSE (0.02) were slightly lower than the model with 53 indices, and the testing MAE (0.17) and MSE (0.04) were also lower than the model with 53 indices. Therefore, we infer that a reduced set of 13 geometric indices highly correlated with SAWS is adequate to build the NN model for asymptomatic AAA.

#### 3.3.3 Neural Network Outputs.

The results of the NN model predictions are shown in Figs. 7(a)–7(d) and Table 3, which compare the prediction accuracies of the models for both groups. For the symptomatic group, the NN model based on 15 indices was trained for 70 epochs and predicted SAWS with a goodness-of-fit (*R*^{2}) of 0.71. The average relative error of the predicted SAWS was 17%. Using all 53 indices, the NN was trained for 140 epochs. The goodness-of-fit (0.71) of the predicted SAWS and the average relative error (16%) of the prediction were similar to those of the NN model using 15 indices. For the asymptomatic group, using the 13 indices as input nodes, the NN trained for 150 epochs predicted SAWS with a goodness-of-fit of 0.79. The average relative error of the predicted SAWS for this network was 17%. Conversely, the NN model with 53 indices as input nodes was found to have a goodness-of-fit of 0.70 with an average relative error of 27%. The latter NN model was trained for 100 epochs.

Algorithms | Number of geometric indices (SAWS, symptomatic AAA) | Goodness of fit (R^{2}) (SAWS, symptomatic AAA) | Relative error (%) (SAWS, symptomatic AAA) | Number of geometric indices (SAWS, asymptomatic AAA) | Goodness of fit (R^{2}) (SAWS, asymptomatic AAA) | Relative error (%) (SAWS, asymptomatic AAA) |
---|---|---|---|---|---|---|
NN | 53 | 0.71 | 15.7 | 53 | 0.70 | 26.8 |
NN | 15 | 0.71 | 16.8 | 13 | 0.79 | 17.4 |
MARS | 53 | 0.65 | 13.5 | 53 | 0.77 | 22.5 |
MARS | 9 | 0.59 | 17.4 | 8 | 0.77 | 21.7 |
GAM | 53 | 1.01 × 10^{–5} | 239.7 | 53 | 0.04 | 38.4 |
GAM | 9 | 0.70 | 15.8 | 8 | 0.80 | 20.0 |
GLM | 53 | ^{a} | ^{b} | 53 | ^{a} | ^{b} |
GLM | 15 | 0.53 | 18.5 | 13 | 0.70 | 22.6 |

^{a} Model did not converge.

^{b} Relative error could not be estimated.

### 3.4 Machine Learning Models

#### 3.4.1 Reduced Subset of Geometric Indices for MARS and GAM.

The geometric indices utilized as input for the MARS and GAM algorithms were selected based on the variable reduction method built into MARS. This method chooses the optimal set of variables that contribute primarily toward a multivariable prediction model. By applying this method, the geometric indices were reduced from 53 to nine for the symptomatic group (*L*, TH_{min}, TT_{minLoc}, GLN, *V*_{ILT}, *L*_{sac}, *D*_{neck,p}, BL, and TH_{Dmax}) and from 53 to eight for the asymptomatic group (*H*, TT_{maxLoc}, *S*, *V*, IPR, TH_{ave}, TH_{min}, and TH_{mode}). GLN, *V*_{ILT}, and TH_{Dmax} were common to the reduced index sets for the NN models and ML algorithms in the symptomatic group. Conversely, *S*, *V*, IPR, TH_{ave}, and TH_{min} were common to the reduced index sets of both types of prediction algorithms for the asymptomatic group.

#### 3.4.2 Machine Learning (MARS and GAM) Outputs.

The results of the MARS and GAM prediction analyses are shown in Figs. 8(a)–8(d), 9(a)–9(d), and Table 3. For the symptomatic group, SAWS predicted by the MARS algorithm using nine indices had a goodness-of-fit of 0.59 with an average relative error of 17%. The prediction from the GAM algorithm using the same nine indices exhibited a goodness-of-fit of 0.70 and an average relative error of 16% (compared to the FEA-estimated SAWS, which is considered to be the ground truth). SAWS predicted for the asymptomatic group using MARS with eight indices exhibited a goodness-of-fit of 0.77 and an average relative error of 22%. The prediction with GAM using the same eight indices exhibited a goodness-of-fit of 0.80 and an average relative error of 20% (compared to the FEA-estimated SAWS).

### 3.5 Linear Regression

#### 3.5.1 Reduced Subset of Geometric Indices for GLM

The correlation matrices illustrated in Figs. 4 and 5 were used to select indices that are significantly correlated to SAWS for both the symptomatic and asymptomatic groups. Hence, the same geometric indices used for the NN models were also used for the predictions with GLM, as 15 indices were found to be highly correlated with SAWS for the symptomatic group while 13 indices were found to be highly correlated with SAWS for the asymptomatic group (see Table 1).

#### 3.5.2 Linear Regression (GLM) Outputs.

The results of the GLM prediction analyses are shown in Figs. 8(e), 9(e), and Table 3. SAWS predicted by GLM for the symptomatic group using the reduced set of 15 indices exhibited a goodness-of-fit of 0.53 with a relative error of 19%. For the asymptomatic group, the 13 indices predicted SAWS with a goodness-of-fit of 0.70 with a relative error of 23%. The GLM models using all 53 geometric indices to predict SAWS did not converge and thus are not presented in the analysis.

## 4 Discussion

Using a novel NN analysis, we attempted to predict aneurysmal wall stress derived from patient-specific computational models, using geometric surrogates. These surrogates (or geometric indices) were unique to either the emergently repaired (symptomatic) or electively repaired (asymptomatic) AAA groups. Using the NN modeling approach, wall stress for the symptomatic and asymptomatic groups exhibited relative errors of 16.8% and 17.4%, respectively (Table 3). Further, the NN-based wall stress prediction exhibited superior model consistency when up to 15 geometric indices were utilized and, comparatively, an improved performance relative to the MARS, GAM, and GLM computational algorithms. These findings underscore the importance and future applications of such high fidelity algorithms for predicting wall stress in AAA computational models.

FEA-based wall stress assessment in AAA models has been frequently reported in the literature [6,8,11,13,30,43,44] as a means to assess their rupture risk. However, the use of geometric indices to predict stress is rather limited [5,17,45,46]. Chauhan et al. [14] performed a variable refinement and wall stress prediction similar to the present work. Their work was based on multivariate statistical models to predict SAWS, in contrast with the use of NN models. Further, their study was focused only on emergently repaired AAAs, while the number of variables obtained after correlation analysis was 12, in comparison to the 15 we obtained for the symptomatic AAA group. The prediction accuracy of SAWS in Ref. [14] for the emergently repaired AAAs was similar to the one we report for the symptomatic AAA group (0.76 versus 0.71). Urrutia et al. [21] performed biomechanical and geometric analyses for three groups of AAAs, namely, surveillance (patients not recommended for elective repair), electively repaired, and emergently repaired AAAs. They utilized seven geometric indices for predicting PWS with a goodness-of-fit of 0.58, 0.64, and 0.78, respectively, for the aforementioned groups. In comparison, our number of geometric surrogates is higher, but the prediction accuracy is comparable for both studies (0.64 versus 0.79 for electively repaired AAAs; 0.78 versus 0.71 for emergently repaired AAAs). Canchi et al. [47] compared the biomechanical and geometric differences between AAA models generated from Asian and Caucasian patients, and utilized geometric indices to predict 99^{th}WS in a diameter-matched cohort. They found that nine geometric indices for Asian AAAs and seven for Caucasian AAAs were able to predict 99^{th}WS with an accuracy of 0.77 and 0.87, respectively. While we cannot directly compare SAWS with 99^{th}WS, the prediction accuracy reported in Ref. [46] was higher than ours for the asymptomatic AAA group. 
Our results are comparable to those reported by Wu et al. [16], where SAWS derived from electively repaired AAAs was predicted by geometric indices with an accuracy of 0.67. In our analysis, the prediction accuracy of SAWS for electively repaired (asymptomatic) AAAs was greater using the NN approach (0.79). One possible explanation for the superiority of NN outcomes over standard statistical methods is the ability of NNs to approximate the nonlinear response of the predictors [48]. Additionally, NNs can capture the relationship between the dependent and independent variables in instances where the inter-relationship between variables is unknown or too complex to handle statistically.
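
As a minimal illustration of the nonlinear approximation noted above, the following sketch hand-sets the weights of a tiny feed-forward network (two dense layers followed by a linear output) so that it reproduces |x| exactly, a response that no single linear term can fit. The layer sizes, the ReLU activation, and the weights are assumptions chosen purely for illustration; they are not the trained Keras models of this study.

```python
def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: activation(W @ inputs + b)."""
    out = [sum(w * x for w, x in zip(row, inputs)) + b
           for row, b in zip(weights, biases)]
    return activation(out) if activation else out


def forward(x):
    """Hypothetical 1-2-1-1 network whose hand-set weights encode |x|."""
    h1 = dense([x], [[1.0], [-1.0]], [0.0, 0.0], relu)  # relu(x), relu(-x)
    h2 = dense(h1, [[1.0, 1.0]], [0.0], relu)           # relu(x) + relu(-x)
    return dense(h2, [[1.0]], [0.0])[0]                 # linear output node


outputs = [forward(x) for x in (-2.0, 0.0, 3.0)]
```

In practice the weights would be learned from the geometric indices and the FEA-derived SAWS rather than set by hand; the point of the sketch is only that stacked nonlinear layers can represent responses that a linear model cannot.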

### 4.1 Comparison of Prediction Accuracy for All the Computational Algorithms.

In the symptomatic AAA group, the highest goodness-of-fit was obtained for both NN models, with 53 and 15 indices (*R*^{2} of 0.71 for both) (Figs. 7(a) and 7(c)). Of these two NN models, the one with 53 indices exhibited the lower relative error. Overall, the lowest and highest relative errors in the symptomatic AAA group were obtained for the MARS model with 53 indices (13.5%) and the GAM model with 53 indices (239.7%), respectively (Table 3). For the asymptomatic group, the highest goodness-of-fit was exhibited by the NN model with reduced indices (Fig. 7(d)) and the GAM model with reduced indices (Fig. 9(d)) (*R*^{2} of 0.79 and 0.80, respectively). The lowest relative error (17.4%) was also obtained for the NN model with reduced indices. Conversely, the highest relative error (38.4%) was obtained for the GAM model with 53 indices (Table 3).
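
The two performance metrics compared above can be sketched as follows. This section does not restate the exact formulas, so the sketch assumes the common definitions: *R*^{2} as one minus the ratio of residual to total sum of squares, and relative error as the mean absolute deviation from the FEA-derived value, expressed as a percentage. The stress values are hypothetical and serve only to exercise the functions.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_relative_error(observed, predicted):
    """Mean of |predicted - observed| / |observed|, in percent (assumed definition)."""
    ratios = [abs(p - o) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical SAWS values (arbitrary units); not data from this study.
saws_fea = [10.0, 20.0, 30.0, 40.0]
saws_pred = [12.0, 18.0, 33.0, 36.0]

fit = r_squared(saws_fea, saws_pred)
error_pct = mean_relative_error(saws_fea, saws_pred)
```

Because the two metrics weigh deviations differently (squared versus proportional), a model can rank first on one and not on the other, as seen for several of the algorithms in Table 3.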

### 4.2 Neural Network Models Are Superior to Multivariate Adaptive Regression Splines and Generalized Additive Model in Predicting Spatially Averaged Wall Stress.

We compared the prediction abilities of three computational algorithms, namely, NN, MARS, and GAM, using both the full set of 53 indices and a reduced set of indices. With the NN approach, the number of input variables or *neurons* was reduced to 15 in the symptomatic (emergently repaired) AAA group and 13 in the asymptomatic (electively repaired) AAA group (Table 1); these retained indices were highly correlated with SAWS. A similar variable refinement procedure reduced the number of input variables for both MARS and GAM to nine for the symptomatic group and eight for the asymptomatic group. For the symptomatic group, the MARS algorithm with 53 variables yielded a lower relative error than its corresponding reduced set analysis (13.5% versus 17.4%; Table 3). Further, for the asymptomatic group, the variable refinement procedure did not have a significant effect on the MARS-based prediction analysis (22.5% versus 21.7%). In the GAM-based models, variable refinement reduced the relative errors by factors of up to 5.2 and 1.9 for the symptomatic and asymptomatic AAA groups, respectively. However, the MARS and GAM predictions did not all have goodness-of-fit measures similar to those of the NN models (Figs. 7–9) (*R*^{2} = 0.59, 0.70, and 0.71 for MARS, GAM, and NN, respectively, in the symptomatic group, and *R*^{2} = 0.77, 0.80, and 0.79 in the asymptomatic group). This suggests that decreasing the number of input variables has mixed effects on the prediction outcomes of these algorithms. NN exhibited the highest average efficiency in the SAWS prediction for both patient groups; using all 53 variables did not yield a significant compromise in accuracy (Fig. 7). Conversely, using all 53 geometric variables as input for MARS and GAM resulted in poor stress predictions (Figs. 8 and 9).
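
A correlation-based variable reduction of the kind referred to above can be sketched as follows. The full refinement procedure and its cutoff are not restated in this section, so the use of a Spearman rank correlation, the threshold of 0.5, the index names, and all numerical values are assumptions made for illustration only.

```python
def ranks(values):
    """Average 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def select_indices(features, target, threshold):
    """Keep the indices whose |rho| with the target meets the threshold."""
    selected = {}
    for name, values in features.items():
        rho = spearman(values, target)
        if abs(rho) >= threshold:
            selected[name] = rho
    return selected


# Hypothetical geometric indices and SAWS for five AAAs; not study data.
features = {
    "surface_area": [100, 120, 140, 160, 180],
    "thrombus_thickness": [9, 7, 8, 4, 2],
    "tortuosity": [1.2, 1.5, 1.1, 1.4, 1.3],
}
saws = [10, 12, 15, 17, 21]
selected = select_indices(features, saws, threshold=0.5)
```

With these made-up values, the surface area (positive association) and the thrombus thickness (negative association) are retained, while the weakly correlated index is dropped.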

### 4.3 Prediction of Spatially Averaged Wall Stress Using Abdominal Aortic Aneurysm Geometry.

In AAA models, maximum wall stress is more strongly associated with curvature measures than maximum aneurysm diameter [5,18,45]. Liljeqvist et al. [46] found that infrarenal aortic volume is strongly associated with AAA growth rate and critical for biomechanical rupture risk assessment. Similarly, Georgakarakos et al. [17] report that AAA mean centerline curvature is a significant predictor of PWS. Geometric measures derived from aneurysm morphology have been shown to be better predictors of rupture risk than the Hardman index [49], a known clinical scoring system utilized prior to endovascular surgeries. Since AAA wall stress is shape dependent, it is reasonable to use geometric indices to predict AAA wall stress in lieu of biomechanical aneurysm models.

Maximum diameter and seven other indices were common to both AAA groups for the reduced set analyses. However, MAA, IPR, TH_{min}, and d_{c} were exclusive to the asymptomatic group. In like manner, *D*_{maxdir}, *H*, *V*_{ILT}, $\gamma$, TT_{max}, and TH_{mode} were exclusive to the symptomatic AAA group set. The remaining indices were thickness-based, thrombus-based, and volume-based geometric measures. For both groups, the surface area and the thrombus thickness at the maximum diameter exhibited the corresponding highest positive and negative associations with SAWS, respectively (Table 1). The significance of surface area concurs with the work of Chauhan et al. [14]. In their study, surface area was one of the strong predictors of SAWS for emergently repaired AAAs; however, this measure was not a part of the final, multivariate geometric surrogate model used to predict SAWS. ILT thickness has been directly associated with heightened expression of matrix degrading enzymes that disrupt aneurysmal wall integrity [50–52]. Noteworthy is that ILT exhibits a “stress-shield” barrier [53,54], contingent on the degree of thrombus attachment with the arterial wall [52,55]. Biomechanically, an increase in ILT thickness results in lower mean and PWSs in patient-specific AAA FEA models [54]. For a cohort of 100 electively repaired AAA models, the FEA-estimated SAWS was moderately associated with ILT thickness at *D*_{max} ($\rho = -0.383$) [14], which is similar to our results ($\rho = -0.389$). The present work supports the notion that thrombus thickness measures exhibit significant negative associations with SAWS. While *D*_{max} is an important discriminatory index in the correlations, other geometric measures such as wall thickness and thrombus-based indices could be used instead of finite element modeling.

### 4.4 Limitations.

The present study is subject to important limitations, which bound the validity and application of the primary outcome of this research. The validation of the AAA wall stress predicted by geometric measures within the context of clinical outcomes is absent from this work. Therefore, it is possible that clinical symptoms do not necessarily correlate with the computationally predicted wall stresses. FEA modeling requires prior knowledge of meshing and FEM software, patient-specific material properties, and the availability of high-performance computers. Thus, the use of AAA wall stress obtained in this manner has limited translational potential. In addition, ILT was not included in the FEA simulations; its inclusion would likely decrease the FEA-estimated SAWS, thereby leading to different overall prediction accuracies. Moreover, wall stress was estimated using a generalized neo-Hookean material model, rather than an anisotropic, fiber-based constitutive material model. The strong correlation between SAWS and global geometric indices is attributed to SAWS being a global measure of stress that is averaged over the entire surface of the AAA wall. Hence, SAWS is a global biomechanical parameter and is not location specific. Conversely, PWS and 99^{th}WS are location specific and may be highly correlated with regional (or local) indices such as wall thickness at that location. Further, NN algorithms have been deemed a “black box” by some authors [58], lacking interpretability of the weights during the model building process [48], and are prone to overfitting [58]. In addition, determining the number of hidden layers, the number of nodes in a network, or the optimal configuration of an NN is not straightforward. NN models perform with high accuracy using large datasets; 50 data points for the symptomatic AAA group and 98 data points for the asymptomatic AAA group are likely not sufficiently large to demonstrate the full potential of such models.
Hence, the NN models are expected to have superior performance when used with significantly larger group sizes.
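
The distinction drawn above between a global measure (SAWS) and a location-specific measure (PWS) can be illustrated with a short sketch. It assumes that SAWS is an area-weighted mean of the stress over the wall surface elements and that PWS is the maximum element stress; the element stresses and areas are hypothetical.

```python
def spatially_averaged_stress(stresses, areas):
    """Area-weighted mean stress over all wall surface elements (assumed form)."""
    return sum(s * a for s, a in zip(stresses, areas)) / sum(areas)


def peak_stress(stresses):
    """Peak wall stress: the largest element stress."""
    return max(stresses)


# Hypothetical wall mesh: 99 elements at a uniform stress plus one hot spot.
stresses = [10.0] * 99 + [50.0]
areas = [1.0] * 100

saws = spatially_averaged_stress(stresses, areas)
pws = peak_stress(stresses)
```

In this made-up case the single hot spot raises PWS fivefold relative to the background stress but shifts SAWS by only 4%, which is consistent with SAWS tracking global geometric indices and PWS tracking local ones.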

## Acknowledgment

The contributions of Dr. Satish Muluk and Dr. Mark Eskandari overseeing the data collection for this study are gratefully acknowledged.

## Conflict of Interest

The authors have no conflicts of interest to disclose.

## Funding Data

National Institutes of Health (Grant No. R01HL121293; Funder ID: 10.13039/1000000668).

## Footnotes

FFNN refers to the neural network used in image segmentation, which is different from the NN model mentioned throughout the manuscript.

## References

**18**(3), pp.