A hybrid computational approach for seismic energy demand prediction


This is a repository copy of A hybrid computational approach for seismic energy demand 
prediction.

White Rose Research Online URL for this paper:
http://eprints.whiterose.ac.uk/156298/

Version: Accepted Version

Article:

Gharehbaghi, S, Gandomi, AH, Achakpour, S et al. (1 more author) (2018) A hybrid 
computational approach for seismic energy demand prediction. Expert Systems with 
Applications, 110. C. pp. 335-351. ISSN 0957-4174 

https://doi.org/10.1016/j.eswa.2018.06.009

© 2018, Elsevier. This manuscript version is made available under the CC-BY-NC-ND 4.0 
license http://creativecommons.org/licenses/by-nc-nd/4.0/.

eprints@whiterose.ac.uk
https://eprints.whiterose.ac.uk/

Reuse 

This article is distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs 
(CC BY-NC-ND) licence. This licence only allows you to download this work and share it with others as long 
as you credit the authors, but you can’t change the article in any way or use it commercially. More 
information and the full terms of the licence here: https://creativecommons.org/licenses/ 

Takedown 

If you consider content in White Rose Research Online to be in breach of UK law, please notify us by 
emailing eprints@whiterose.ac.uk including the URL of the record and the reason for the withdrawal request. 

mailto:eprints@whiterose.ac.uk
https://eprints.whiterose.ac.uk/


A HYBRID COMPUTATIONAL APPROACH FOR 1 

SEISMIC ENERGY DEMAND PREDICTION 2 
 3 

S. Gharehbaghia, A.H. Gandomib,*, and S. Achakpourc, M.N. Omidvard 4 

 5 

aDepartment of Civil Engineering, Behbahan Khatam Alanbia University of Technology, Behbahan, Iran 6 
bSchool of Business, Stevens Institute of Technology, Hoboken, NJ 07030, USA 7 

cDepartment of Civil Engineering, Iran University of Science and Technology, Narmak, Tehran, Iran 8 
dCenter of Excellence for Research in Computational Intelligence and Applications, School of Computer 9 

Science, University of Birmingham, Birmingham B15 2TT, U.K. 10 

 11 

Abstract. In this paper, a hybrid genetic programming (GP) with multiple genes is implemented for 12 

developing prediction models of spectral energy demands. A multi-objective strategy is used for 13 

maximizing the accuracy and minimizing the complexity of the models. Both structural properties and 14 

earthquake characteristics are considered in prediction models of four demand parameters. Here, the 15 

earthquake records are classified based on soil type assuming that different soil classes have linear 16 

relationships in terms of GP genes. Therefore, linear regression analysis is used to connect genes for 17 

different soil types, which results in a total of sixteen prediction models. The accuracy and 18 

effectiveness of these models were assessed using different performance metrics and their performance 19 

was compared with several other models. The results indicate that not only the proposed models are 20 

simple, but also they outperform other spectral energy demand models proposed in the literature.  21 

Keywords: Evolutionary computation; Genetic programming; Regression Analysis; Input energy; 22 

Hysteretic energy; Seismic energy spectra. 23 

  24 

 
*Corresponding author.  
Email:a.h.gandomi@stevens.edu 
Tel.: +1 (201) 216-5029; Fax: 201-216-5385 
URL: http://gandomi.beacon-center.org/ 

https://mail.google.com/mail/u/0/h/mur9z0mq4ulr/?&cs=wh&v=b&to=a.h.gandomi@stevens.edu
http://gandomi.beacon-center.org/


Table of Contents 25 
 26 
1. Introduction  27 

2. Seismic Energy Concept and Formulation 28 

3. Genetic Programming  29 

3.1. Multi-Gene Symbolic Regression 30 

3.2. Multi-Objective Genetic Programming (MOGP) 31 

3.3. Accelerating GP Process 32 

4. Predicting Seismic Energy Demand Spectra Using MOGP 33 

4.1. Preparing Exact Data 34 

4.1.1. Inelastic SDOF Systems 35 

4.1.2. Earthquake Ground Motions 36 

4.2. Model Development Using MOGP 37 

5. Results and Discussion 38 

5.1. Mathematical Model for EDP1 39 

5.2. Mathematical Model for EDP2 40 

5.3. Mathematical Model for EDP3 41 

5.4. Mathematical Model for EDP4 42 

5.5. Accuracy of the Models 43 

5.6. Comparative Study  44 

5.6.1. Assumptions  45 

5.6.2. Comparison 46 

6. Summary and Conclusion 47 

  48 


1. Introduction 49 

The approaches currently used for the seismic analysis and design of structures need to be improved 50 

through considering appropriate engineering demand parameters that would represent the 51 

characteristics of a structure and the design earthquake. In current approaches, either the structural 52 

members are designed based on satisfying the balance between the force demand and the corresponding 53 

strength supply while providing an adequate level of ductility (based on e.g. ASCE/SEI7 2010), or 54 

based on the concept whether they are force- or deformation-controlled (based on e.g. FEMA-356 55 

2000). Such approaches disregard the frequency content and duration of earthquake ground motion, as 56 

well as the velocity response and hysteretic behavior (Gupta 1990). The importance of considering 57 

these factors lies in evidences suggesting that, for instance, dissipated hysteretic energy due to repeated 58 

inelastic excursions could result in a certain amount of seismic damage (Fajfar and Vidic 1994). In fact, 59 

in addition to the force and deformation, the energy demand is of great importance in capturing the 60 

mentioned seismic factors as the inelastic behavior is expected to occur due to the design and maximum 61 

earthquakes. Housner (1956) was first to introduce these factors through defining the energy concept. 62 

This concept requires that the energy dissipation capacitybe less than the input energy demand.  63 

Both structural properties and earthquake characteristics affect the seismic energy demand. 64 

Determining the spectral values of energy demand is beneficial due to its connection tothe amount of 65 

structural damage (Fajfar and Vidic 1994, Gharehbaghi 2018). The design codes have not explicitly 66 

implemented the energy demand parameter in predicting seismic demands yet. Moreover, the priority 67 

of the energy-based design approach compared with the conventional strength-based design approach 68 

needs further studies. Although, previous studies (e.g. Housner 1956, Fajfar and Vidic 1994, Kalkan 69 

and Kunnath 2007, Akiyama 1985, Bertero and Uang 1988, Decanini and Mollaioli 2001, Manfredi 70 

2001) have stipulated that the seismic energy parameters are of great importance in seismic design of 71 

structures. It was shown that the hystertic energy demand is directly connectedto the structural damage 72 

(Fajfar and Vidic 1994, Kalkan and Kunnath 2007, Akiyama 1985, Bertero and Uang 1988, Decanini 73 

and Mollaioli 2001, Manfredi 2001, Benavent-Climent et al. 2010, Gharehbaghi 2018). After more than 74 

two decades that the Housner’s proposal was almost neglected, it was received considerable attention 75 


among researchers (Akiyama 1985, Kuwamura and Galambos 1989), and became the key issue of a 76 

conference held in Bled city of Slovenia (Fajfar and Krawinkler 1992). It was recognized that input 77 

energy and hysteretic energy are the indicators of ground motion and have a correlation with the 78 

structural damage, and the quantity related to cumulative damage is the hysteretic energy (Fajfar and 79 

Vidic 1994, Bertero and Uang 1988, Decanini and Mollaioli 2001). Most recently, Deniz et al. (2017) 80 

also found that the most appropriate and reliable intensity measure for the seismic fragility analysis of 81 

buildings is the seismic energy demand.  82 

Estimation of input and hysteretic energy demands using mathematical models could be considered 83 

as one of the important steps aligned with the extension of the energy-based seismic analysis and design. 84 

As previously mentioned, both structural and earthquake characteristics need to be accounted for the 85 

issue. Some earthquake characteristics such as soil type, earthquake magnitude, peak ground 86 

acceleration (PGA), peak ground velocity (PGV), cumulative energy index, fault type, distance from 87 

the hypocenter, were used by researchers in determining the energy spectra (e.g. Zahra and Hall 1984, 88 

Uang and Bertero 1988, Fajfar et al. 1989, Uang and Bertero 1990, Sucuoğlu and Nurtuğ 1995, 89 

Khashaee 2004). In addition to the earthquake characteristics, ductility ratio, damping ratio, and 90 

hysteretic behavior model (e.g. elastic-perfectly plastic, bilinear, pinching, Takeda, and Clough models) 91 

were the influential structural properties involved in the estimation of seismic energy demand spectra 92 

(e.g. Sucuoğlu and Nurtuğ 1995, Decanini and Mollaioli 2001, Benavent-Climent et al. 2011).Several 93 

works have been carried out on the estimation of the seismic energy demand parameters. Housner 94 

(1956) presented a model to determine input energy based on the spectral velocity of SDOF system. 95 

Kuwamura and Galambos (1989) presented energy demand spectra considering the soil type and 96 

dominant period of the earthquake. Chou and Uang (2000) estimated absorbed energy for an inelastic 97 

system by using an attenuation relation. They used nonlinear regression analysis considering both 98 

structural and earthquake variables. Manfredi (2001) proposed simple/efficient mathematical models 99 

to estimate input and hysteretic energy spectra. A dimensionless seismic index that is a function of 100 

PGA, PGV and cumulative energy was proposed to estimate the seismic energy spectra. Although the 101 

estimation models were simple and effective, the effect of soil behavior was not considered, and the 102 


number of earthquake ground motions was rather limited. Decanini and Mollaioli (2001) proposed the 103 

formulation of elastic seismic energy spectra. They also presented a comprehensive study to propose 104 

the design inelastic energy spectra by introducing the response modification factor for the input energy. 105 

Several structural variables (e.g. ductility ratio and hysteretic behavior) and earthquake characteristics 106 

such as soil type, source-to-site distance, and earthquake magnitude were considered in the proposed 107 

spectra. Arroyo and Ordaz (2007) estimated the hysteretic energy demand spectra from elastic response 108 

parameters in accordance with the earthquake events recorded in Mexico City. Their mathematical 109 

models were a function of pseudo-acceleration, velocity and displacement spectra. Elastic design input 110 

energy spectra based on Iranian earthquakes were also presented by Ghodrati Amiri et al. (2008). 111 

Recently, Dindar et al. (2015) proposed two regression-based simple mathematical models to estimate 112 

the input and hysteretic energy spectra. A database of earthquake ground motion records composed of 113 

near- and far-fault ones, PGA, soil types, earthquake magnitude, ductility ratio, and hysteretic behavior 114 

model was included in the proposed models. Using the regression analysis, Quinde et al. (2016) also 115 

proposed mathematical models to estimate the seismic energy spectra of inelastic systems located on 116 

the soft soil for Mexico City. They captured the effect of ductility ratio of inelastic systems and 117 

dominant period of the probable earthquakes on the presented models. More recently, Zhai et al. (2016) 118 

proposed an expression to account for the effect of after-shock on the input energy spectra using an 119 

equivalent velocity. Alıcı and Sucuoğlu (2016) carried out a regression analysis to estimate inelastic 120 

input energy spectrum. The prediction equations for the input energy spectra were expressed in terms 121 

of an equivalent velocity. Some crucial earthquake characteristics including soil type, epicentral 122 

distance, moment magnitude, and the fault type were considered in the proposed models. All the 123 

previously mentioned works use conventional regression methods to estimate their energy parameters 124 

of interest.  125 

Based on the capability of soft computing approaches and their recent advances, it is worthwhile to 126 

use such efficient approaches for seismic demand prediction. Computational complexity of the 127 

conventional methods and their limitations has made soft computing techniques, such as evolutionary 128 

algorithms, artificial neural networks, support vector machines, and fuzzy logic, popular for solving 129 


complex engineering problems. A common application of these tools is in predictive analysis for 130 

modeling the nonlinear dependency of the input parameters to the output value(s) where the 131 

conventional approaches (e.g. regression analysis) fail or perform poorly (Khan et al. 2003, Gandomi 132 

and Roke 2015). Despite the success of artificial neural networks (ANNs) in prediction, they are 133 

inappropriate to develop practical intelligible equations. In addition to ANNs, support vector machines 134 

(SVMs) are another primary class of soft computing methods used to discover patterns and approximate 135 

relationships when large quantities of data is available. Although both ANNs and SVMs have received 136 

significant attention (e.g. Salajegheh and Heidari 2005, Gholizadeh and Salajegheh 2009, Papadopoulos 137 

et al. 2012, Gharehbaghi and Khatibinia 2015, Khatibinia et al. 2015, Yazdani et al. 2016), they require 138 

a pre-defined and initial structure for the equation and network architecture to be determined by the 139 

user. Genetic programming (GP), a learning algorithm originated from genetic algorithms, is another 140 

well-known and successful technique for developing nonlinear mathematical models for the complex 141 

problems. GP and its variants have been effectively used for solving various problems in civil 142 

engineering (e.g. Kayadelenet al. 2009, Alavi et al. 2011, Mirzahosseini et al. 2011, Gandomi et al. 143 

2012, Vardhan et al. 2016; Lim et al. 2016). Several variants of GP have been proposed in the literature, 144 

such as gene expression programming (Ferreira 2006) and multi-stage genetic programming (Gandomi 145 

and Alavi 2011).One of the robust variants of GP is multi-gene genetic programming (MGGP) that 146 

adds the capability of conventional regression to the standard GP ability in parameter estimation. The 147 

effectiveness of MGGP has been proved in the works reported by Gandomi et al. (e.g. Gandomi and 148 

Alavi 2012a,b, Gandomi et al. 2013, Babanajad et al. 2013, Gandomi et al. 2016). 149 

Structural and earthquake engineering has benefited from the soft computing techniques in different 150 

applications. For instance, ANNs and SVMs have been widely used for risk assessment, seismic 151 

response prediction, control and health monitoring (Tsompanakis and Topping 2011). In this paper, 152 

MOGP is used for predicting the seismic energy demand spectra considering both typical structural and 153 

earthquake characteristics. For this purpose, eighteen set of the single-degree-of-freedom (SDOF) 154 

systems with the structural properties of different hardening ratios of bilinear hysteretic behavior model, 155 

damping ratios, and ductility ratios are used to determine the energy demand spectra mentioned. Also, 156 


four different sets of earthquake ground motion records based on their soil types (soft, firm, stiff and 157 

rock) with the source-to-site distances of more than 17.5 km and the magnitudes of greater than 5.5 158 

were used. It was assumed that the different soil classes have linear relationships in terms of GP genes 159 

which help to find one equation with different coefficients for different soil types. The records were 160 

scaled to two PGA levels 0.5g and 1.0g. Finally, four mathematical models corresponding to the four 161 

engineering demand parameters (EDPs) of spectral input and hysteretic energy, spectral hysteretic to 162 

input energy ratio, and spectral energy modification factor, are proposed using MOGP. Then, the 163 

effectiveness of the models is revealed using the performance metrics compared with those of available 164 

in the literature. 165 

In this study, section 2 describes the seismic energy concept and its formulation. Also this section 166 

introduces the seismic energy based EDPs which can be useful in seismic design of inelastic structures. 167 

Section 3 express a hybrid computational approach based on genetic programming used as a predictive 168 

tool herein. Section 4 describes a framework for prediction of the EDPs. A set of mathematical models 169 

are proposed and their accuracy are examined using some performance metrics in section 5. Finally, 170 

the developed models are discussed and compared with some other methods proposed in the literature. 171 

 172 

2. Seismic Energy Concept and Formulation 173 

Housner (1956) first proposed the idea of the energy-based seismic design approach. When ground 174 

motion transmits energy into a structure, some of the energy is dissipated through the damping and 175 

inelastic behavior. The remained energy of the structure is stored in the form of kinetic energy and 176 

elastic strain energy. Housner stipulated that the energy supply should be more than the energy demand 177 

during an earthquake in the form of this principle that energy supply <  energy demand for controlling 178 

and avoiding the structural collapse (Housner 1956).  179 

When a structure is subjected to earthquake excitation, its governing equation for the dynamic 180 

behavior of an inelastic SDOF system could be written as (Chopra 2012): 181 

( ) ( ) ( ( ), ( )) ( )s gmu t cu t f u t u t mu t+ + = −  (1) 


where m, c and fs represent the mass, damping coefficient and lateral resisting force of SDOF system, 182 

respectively; ü(t), ( )u t and u(t) are acceleration, velocity and displacement relative to the ground with t 183 

representing time in SDOF system, respectively; and üg(t) is the earthquake ground acceleration. 184 

Governing equation of energy equilibrium is obtained by the integration of Eq. (1) with respect to u 185 

(Uang and Bertero 1990, Chopra 2012): 186 

( ) ( ) ( ( ), ( )) ( )s gmu t du cu t du f u t u t du mu t du+ + = −     (2) 

In fact, Eq. (2) expresses the energy balance for a structural system during an earthquake while Eq. 187 

(1) explains the force balance. Substituting displacement unit (du) by velocity term and integrating it 188 

over the time of the earthquake ground motion t, Eq. (2) is expressed as: 189 

( ) ( ) ( ) ( ) ( ( ), ( )) ( ) ( ) ( )s gmu t u t dt cu t u t dt f u t u t u t dt mu t u t dt+ + = −    , (3) 

where t is the time of interest across the earthquake ground motion. Eq. (3) can also be written in a 190 

general form as follows: 191 

( ) ( ) ( ) ( )K D A IE t E t E t E t+ + = , (4) 

where EI(t), EK(t) and ED(t) are the input energy demand, kinetic energy, and damping energy, 192 

respectively; EA(t) is encompassed the recoverable elastic strain energy ES(t) and the irrecoverable 193 

plastic hysteretic energy EH(t). The amount of EH(t) is equal to zero in the elastic systems and is 194 

appeared in the inelastic systems. Therefore, Eq. (4) can be expressed as: 195 

( ) ( ) ( ) ( ) ( )K D S H IE t E t E t E t E t+ + + = , (5) 

where EK(t) and ES(t) are cumulative during the earthquake ground motion and are vanished at the end 196 

of motion of the inelastic systems. In effect, these two terms are very small in comparison with ED(t) 197 

and EH(t). Since the most portion of energy demand is dissipated through the damping energy and 198 

hysteretic energy, the Eq. (5) can be approximately written as (Uang and Bertero 1990): 199 

( ) ( ) ( )D H IE t E t E t+  . (6) 

According to Eq. (6), the main portion of input energy demand is converted to the damping energy 200 

and hysteretic energy. Besides, if the structural system remains in the elastic range during an 201 


earthquake, EH(t) is trivial, and the energy-based analysis is not useful for seismic design (Uang and 202 

Bertero 1990).As mentioned before, for design basis earthquakes, it is expected that a structure will 203 

experience inelastic cyclic deformations resulting in hysteretic energy dissipation. As previously 204 

mentioned, the hysteretic energy dissipation (EH) is directly attributed to the structural damage where 205 

the EH /EI ratio has been introduced as a good indicator of expected damage (Fajfar and Vidic 1994, 206 

Sucuoğlu and Nurtuğ 1995, Decanini and Mollaioli 2001, Manfredi 2001). For a given ductility ratio 207 

(), EH /EI ratio is defined as follow: 208 

H

I

E
HI

E





= , (7) 

where the EI and EH are the input and hysteretic energy corresponding to the ductility ratio of . 209 

Another parameter that is of crucial importance for earthquake resistant design based on the energy 210 

concept could be the response modification factor of the input energy that can be expressed as follow: 211 

I
I

I

E
RE

E



= . (8) 

The literature suggests that to have a practical energy based seismic design, the computation of the 212 

input and hysteretic energy (EI and EH), hysteretic energy to input energy ratio (HI), response 213 

modification factor of input energy (REI), and energy dissipation capacity is useful. Since the design 214 

method requires inelastic dynamic analyses resulting in the expensive computational efforts, the use of 215 

soft computing techniques is of great importance in predicting the practical mathematical models 216 

(formulations). Except for the energy dissipation capacity that needs comprehensive experimental and 217 

theoretical studies, the spectral values of the mentioned energy-based EDPs were predicted using 218 

MOGP. The next section describes the MOGP. 219 

3. Genetic Programming  220 

There are two groups of models which can be used for modeling the complex nonlinear engineering 221 

systems: phenomenological and behavioral (Gandomi et al. 2016). Phenomenological models need a 222 

predefined structure obtained from the physical laws requiring a previous understanding about the 223 

system. Concerning the complex systems, sometimes it is hard to find such models. Unlike the 224 


phenomenological models, behavioral models can be simply generated by finding a reasonable 225 

approximate relation between input variables and the output value(s) for a collection of data 226 

(experimental or theoretical) irrespective of their governing physical principles. Although one of the 227 

main advantages of behavioral modeling techniques is their independence on prior knowledge about 228 

the governing physical relationships of input and outputs (Walter and Pronzato 1997, Metenidis 2004), 229 

most of these models need the user to pre-assign a formulation pattern requiring optimization of its 230 

unknown coefficients. Concerning the complex engineering systems, the use of conventional 231 

techniques such as regression analysis cannot be guaranteed to find an accurate and reliable behavioral 232 

model (Gandomi et al. 2016). It has been well recognized that most of the structural earthquake 233 

engineering problems such as determining earthquake response of inelastic structures could be 234 

considered as such complex problems. 235 

Genetic programming (GP) (Koza 1990) is a novel behavioral modeling methodology with 236 

completely new characteristics. GP is an extension of the genetic algorithm capable of functionalizing 237 

data using tree structures. In fact, unlike classic regression models and ANNs, GP is capable of 238 

generating a prediction equation irrespective of a predefined structure. The successful application of 239 

GP and its variants have been reported in solving the various real-world problems (e.g. Gandomi and 240 

Alavi 2012a,b, Gandomi et al. 2013, Babanajad et al. 2013, Gandomi et al. 2016). One of the robust 241 

variants of GP is MGGP that adds the capability of conventional regression to the standard GP ability 242 

in parameter estimation, which has been proposed recently (Searson et al. 2007).The initial MGGP 243 

studies show that it can outperform other GP variants (Gandomi and Alavi 2012a,b). MGGP is 244 

described in the next subsection.  245 

3.1. Multi-Gene Symbolic Regression  246 

GP can generally be defined as a supervised machine learning technique that searches a program space 247 

instead of a parameter space. One of the useful variants of GP is Multi-gene genetic programming 248 

(MGGP) (Searson et al. 2007, Gandomi and Alavi2012a). MGGP is used to design mathematical model 249 

predictions which are inherently multi-gene, i.e., those models consisting of linear combinations of low 250 

order nonlinear transformations of the input variables. Unlike conventional GP that is based on an 251 


evaluation of single tree expression, MGGP uses a single GP particle swarm model selection program 252 

constructed from a number of genes where each gene has a tree expression (Searson et al. 2010). In 253 

effect, the model development procedure is decomposed by MGGP consisted of some relatively simple, 254 

fixed-depth sub-models. 255 

To develop a population of trees, GP typically uses symbolic regression. Compared with the 256 

conventional GP, a weighted linear combination of outputs from a number of GP trees is considered as 257 

a symbolic model in which each of these trees could be considered as a “gene” (Searson et al. 2010). 258 

The number of genes and tree depth of any gene can be specified by the user which their maximum 259 

values depend on the complexity of the models developed by MGGP. The evolved models are linear 260 

combinations of low-order nonlinear transformations of the predictor variables (Searson et al. 2010). 261 

The ordinary least squares method is used to estimate the linear coefficients for each of the evolved 262 

genes of an individual. A more detailed explanation of MGGP is available in Refs. (Searson et al. 2007, 263 

Searson et al. 2010). 264 

A fixed linear illustration associated with binary encoding of all parameters is used in GA, which 265 

results in a string of numbers as output. In GP however, the optimization problems are solved 266 

irrespective of a pre-defined solution structure. Depending on the problem domain, the first generation 267 

consisting of a population of possible solutions is randomly created by GP, and variation operators 268 

generate new candidate solutions then. During the evolutionary process, crossover selects a node from 269 

the parental individuals and exchanges the subtrees under the selected nodes randomly and creates a 270 

new individual. Mutation truncates and replaces one node of a tree with another randomly generated 271 

node from the same set, and then creates a new individual from an existing tree in the population. The 272 

individuals with higher fitness values have higher survival likelihood in the successive generation. To 273 

find the best fitting solution, the individual solution of a population is updated after executing a number 274 

of runs in each generation, and the parental individuals are selected from the population based on a 275 

good fitness function. To develop a population of genes, symbolic regression method can be 276 

implemented using standard GP in which a symbolic mathematical expression is directly encoded by 277 

each of the genes. Based on Figure 1 showing a typical multi-gene model, known as multi-gene 278 


symbolic regression, three input variables (x1, x2, and x3) are used to predict the response. As shown in 279 

the figure, although the nonlinear terms such as “sin” and “log” are used, the overall model is a weighted 280 

linear combination of each gene utilizing the coefficients go, g1, and g2. Mathematically, the general 281 

formulation of the multi-gene symbolic regression model can be written as (Gandomi et al. 2016): 282 

1

ˆ( , , ) ( ( ))
n

o i i
i

y g g G
=

= + x g x   (9) 

where go is a bias term; gi is the gene weight, and Gi(ș,x) is the outputs vector from the ith gene 283 

encompassing a multi-gene individual; ș is the vector of the unknown parameters for each gene; and n 284 

is the number of genes. It should be noted that the algorithmic structure of MGGP, and standard GP is 285 

the same, except for crossover and mutation of multi-gene individuals. MGGP does not necessitate any 286 

simplifying assumption in the model development process, and it is more accurate and efficient than 287 

the standard GP for modeling complex nonlinear problems (Gandomi and Alavi 2012a,b). To construct 288 

an initial population in MGGP, random individuals are created by using different nonlinear functions, 289 

input variables, and a range of random constants and each individual includes 1 and Gmax number of 290 

genes. The algorithm attempts to maximize diversity by ensuring that no individuals contain duplicate 291 

genes. The genes are randomly selected and the least squares normal equation is used to estimate the 292 

vector of unknown coefficients g as follows (Searson et al. 2007, Searson et al. 2010, Hii et al. 2011, 293 

Searson 2014): 294 

( ) 1T T−=g G G G y  (10) 
where G = [1 G1…Gn] is the gene response matrix. Since the columns of matrix G can be collinear, the 295 

Moore–Penrose pseudo-inverse (GTG)# can be computed by means of the singular value decomposition 296 

instead of the standard matrix inverse (GTG)-1.Genes can be employed or eliminated using a tree 297 

crossover operator (high-level crossover) during the MGGP run which is in addition to the standard GP 298 

sub-tree crossover (a low-level crossover). The low-level crossover chooses a gene randomly from each 299 

parent individual. Next, the standard sub-tree crossover is employed and the generated trees replace the 300 

parent trees in the otherwise unaltered individual in the next generation. The high-level crossover allows 301 


the exchange of one or more genes with another selected individual subject to the Gma x constraint. The 302 

maximum number of genes of an individual is limited to Gmax.If any individual contains more genes, 303 

the additional genes are randomly selected and deleted (Searson et al. 2007, Searson et al. 2010, Hii et 304 

al. 2011, Searson 2014). 305 

3.2. Multi-Objective Genetic Programming (MOGP) 306 

Generally, both tree-based GP and MGGP deal with a single objective optimization problem for each 307 

individual considering the defined fitness function in which, for symbolic regression, the goodness-of-308 

fit to the training data is considered as the only objective to be maximized. Although MGGP yields 309 

more compacted models compared with standard GP (Searson 2014), ineffective genes may be acquired 310 

by multi-gene models and the single objective optimization problem results in the evolution of overly 311 

complex, impractical and non-robust models (Gandomi et al. 2016). The simplest solution to eliminate 312 

such shortcoming can be provided by limiting G in a model to Gmax which is a hard-to-determine unique 313 

value for any given problem (Searson 2014). One good solution is the use of multi-objective concepts 314 

into symbolic regression which is commonly referred to as multi-objective genetic programming 315 

(MOGP). Using this methodology, both the goodness-of-fit and the complexity of the developed models 316 

can be optimized simultaneously by searching the so-called Pareto front (non-dominated solutions) set. 317 

Herein, the GPTIPS 2 toolbox (Searson 2014) associated with the related subroutines coded in 318 

MATLAB (2013) is used to solve the MOGP by using a non-dominated sorting technique (Deb et al. 319 

2002). To sort the non-dominant solutions by their complexity and precision, the non-dominated sorting 320 

method is applied at the end of each generation of the MGGP algorithm. At first, the individuals are 321 

classified from both the new and old population based on their position on the Pareto front. Pareto front 322 

of each level encompasses a set of Pareto optimal solutions which other solutions do not dominate. In 323 

addition, the solutions that include Pareto front of each level are not dominated by any other solution, 324 

apart from those of in its previous Pareto front level. After that, a “crowding factor” (i.e., the average 325 

distance of a solution from the nearest solutions (either side) on the same Pareto front) is computed for 326 

separately all individual to increase population diversity, giving lower priority to the solutions that are 327 

crowded together during the ranking process. Finally, the position of solutions is used to rank them 328 


(those on level 1 are ranked above those on level 2, and so on),and a crowding factor of each solution 329 

is used to rank those within the same level. The top 50% of the population is remained to participate in 330 

the next generation, whilst the rest are eliminated (Searson 2014). 331 

3.3. Accelerating GP Process 332 

Typically, the data sets used in engineering studies are complex and do not include a very large number 333 

of records particularly for experimental studies (Gani et al. 2016). While the successful application of 334 

GP in modeling engineering systems has been reported in the literature (e.g., (Sajjadi et al. 2016)), it 335 

can be difficult to model the systems with big data using GP. The evolutionary approaches are often 336 

slower than statistical data mining. Since GP is usually used to find the structure of solution(s), it is one 337 

of the slowest evolutionary algorithms. In addition, the extra process of non-dominated sorting of a 338 

multi-objective GP magnifies the problem. To improve this weakness, in this paper, two strategies are 339 

used in the prediction process:  340 

• 60% of data (2160 samples) were randomly selected and used for training process, and the rest 341 

(1440 samples) used as training set for each run;  342 

• The final Pareto front was determined from merging the Pareto fronts for all runs.  343 

In general, there are two classes of machine learning algorithms including trajectory based 344 

algorithms and population-based algorithms. ANNs and regression analysis are two well-known 345 

examples of trajectory algorithms. In contrast, GP is one of the mostly used population-based 346 

algorithms which it deals with a set of the solution in each generation. This feature makes it flexible to 347 

adopt with parallel processing, therefore, the paralleled computations can be used to accelerate MOGP 348 

procedure using a distributed computing machine in order to deal with the Big Data issue in GP. 349 

Although only twelve cores were used to evolve and evaluate new models herein, the number of cores 350 

can be increased up to the population size using this framework. The schematic of parallel processing 351 

in the GP process is shown in Figure 2. 352 

4. Predicting Seismic Energy Demand Spectra Using MOGP 353 

4.1. Preparing Exact Data 354 


Since the inelastic responses of an SDOF system highly depend on both structural and earthquake 355 

ground motion variables, the most influential ones are contributed in predicting spectral seismic energy 356 

demand. The input variables are described in the next subsections in detail.  357 

4.1.1. Inelastic SDOF Systems  358 

The structural variables used in this study are the hardening ratio of bilinear hysteretic behavior model 359 

(), damping ratio (), and the displacement ductility ratio () which are the structural properties used 360 

to determine the energy demand spectra mentioned. The values assumed for  were 0.0, corresponding 361 

to the elastic-perfectly plastic model, and 0.1 indicating bilinear model. Three values of 0.05, 0.10 and 362 

0.15 were also used for  of the inelastic SDOF systems. In addition, three common values of 2, 4 and 363 

6 were taken into account for . The periods range studied for the prediction of the energy spectra was 364 

between 0.01 to 5.0 second for every 0.05 second. These considered variables (, , and ) resulted in 365 

18 inelastic SDOF systems used for the prediction.  366 

4.1.2. Earthquake Ground Motions 367 

Three factors of the site class, source-to-site distance, and PGA are the three variables considered for 368 

the earthquake ground motion records used. Based on the shear wave velocities corresponding to the 369 

30 m in depth (Vs,30) of more than 750, 360 to 750, 180 to 360 and less than 180 m/s, four soil types of 370 

S1, S2, S3, and S4 were assumed for the records used. Site types of S1, S2, S3, and S4 are respectively 371 

representing the soft, firm, stiff and rock soil types. To consider the source-to-site distance, the records 372 

having the Joyne-Boor distance (RJB) in the range of more than 17.5km and less than 150km were used 373 

(known as far-fault records). In addition, two values of 0.5g and 1.0g were used for the PGA of the 374 

records. All records are non-pulse-like and have the magnitude (M) of greater than 5.5. All the records 375 

were downloaded from NGA-West-II project of PEER ground motion database (2017). The diversity 376 

of M, RJB and Vs,30, and the number of records are shown in Figure 3. As shown in this figure, the records 377 

are selected in a way that they have a large variety of the properties mentioned. The individual pseudo-378 

spectral acceleration of each record of each soil type and their mean spectra are also shown in Figure 379 

4.  380 


Four main seismic energy-based EDPs were chosen to be predicted: (i) EDP1: spectral EI/m; (ii) 381 

EDP2: spectral EH/m; (iii) EDP3: spectral HI; and (iv) EDP4: spectral REI For this purpose, based 382 

on the values assumed for the structural variables, the 18 SDOF systems were modeled and subjected 383 

to the mentioned earthquake records. A large number of inelastic time history dynamic analyses (more 384 

than 1 million) were carried out, and the exact EDPs were determined. The entire process was simulated 385 

in MATLAB platform (2013).  386 

For each EDP, individual spectral energy responses of the SDOF systems under each set of 387 

earthquake records were obtained. Then, based on the normal distribution, the mean plus one standard 388 

deviation (mean+) for each EDP under each set of the earthquake records were determined to be 389 

predicted. 390 

4.2. Model Development Using MOGP  391 

To develop powerful models, the suitable parameters ought to be utilized as a part of the MOGP 392 

predictive algorithm. To obtain the optimum MOGP models, basic arithmetic operators (+, -, ×, /) and 393 

mathematical functions (e.g. tanh, ln) were used. The models are formed by randomly combining the 394 

components from the functional set and the terminal set. The number of programs (solutions) in the 395 

population is determined by the population size, and the number of levels that the calculation would 396 

apply before the run ends are resolved as per the number of generations.The nature of thedata set, 397 

problem complexity, and the number of variables are the three determining factors for the population 398 

size and the number of generations. Note that, the upper bounds of an individual (Gmax) and the 399 

maximum tree depth (Dmax) need to be defined in order to restrict the complexity. To conduct a trade-400 

off between the running time and the complexity of the evolved solutions, optimal values of 3 and 5 401 

were respectively assumed for Gmax and Dmax. The parameter settings used for the MOGP 402 

implementation listed in Table 1 are based on the previously suggested values in the literature (Searson 403 

et al. 2007, Searson et al. 2010, Hii et al. 2011, Searson 2014) and employing a case-dependent trial-404 

and-error process. 405 

In order to generate new genes for individuals as well as to decrease the overall number of genes for 406 

one model and increase the total number of genes for the other, a “rate-based high-level crossover” 407 


through the use of a crossover rate parameter (CR) is employed herein. A uniform random number 408 

between 0 and 1 with a default value of 0.5 is generated separately for each gene in the parents. If r 409 

isless than CR, the corresponding gene is moved to the other individual. In case the exchanging process 410 

results in offspring that contain more genes than the Gmax, the gene is randomly eliminated such that the 411 

constraint is no longer violated (Searson 2014). Two data sets are needed for the analyses. Therefore, 412 

data are randomly divided into two subsets for training and validation.  The training dataset is used for 413 

learning and the validation set for determining the quality of the evolved programs on unseen data. 414 

Several combinations of training and validation sets were considered to determine a consistent data 415 

division. To evaluate the evolved expressions and finding the best-encoded one, the minimum of the 416 

root mean square error (RMSE) is used as the fitness function. RMSE can be expressed as follows: 417 

2
1

1
( )

n
i ii

RMSE e p
n =

= − , (11) 

where pi and ei are the predicted, and exact output values for the ith output, respectively; and n is the 418 

number of samples. Two objectives of maximizing the correlation coefficient (R) and minimizing the 419 

model complexity are used in MOGP approach in order to select the best final model. 420 

5. Results and Discussion  421 

Using MOGP, all EDPs (EDP1, EDP2, EDP3, and EDP4) were predicted, and their optimal 422 

mathematical models (formulations) were determined. Four cases (S1 to S4) based on different soil 423 

types of the earthquake records were considered in the prediction. Although it is quite possible that the 424 

obtained model formulations be different for different soil types, it is more practical to develop a unique 425 

mathematical model for an EDP with different coefficients. Therefore, in this paper, the complete 426 

database (which includes four soil types of S1 to S4) is employed to develop a unique prediction model 427 

for each EDP. The final mathematical model is selected based on a compromise between the prediction 428 

accuracy (as measured by the correlation coefficient R) and the model complexity (as measured by the 429 

number of input variables). After that, the complete database is divided into four groups based on the 430 

four soil types. Using each group of data, the predicted coefficients of the final model (gi) are re-431 

evaluated by conducting the regression analysis to reflect the influence of the soil type. Finally, four 432 


mathematical models including structural variables (and PGA of earthquake records) with four different 433 

groups of coefficients corresponding to the four cases mentioned above (S1 to S4) obtained are 434 

presented herein. The contribution of each input variable in the mathematical prediction models and the 435 

Pareto front obtained by using a nondominated sorting method at the end of an MOGP run are presented. 436 

The results of all EDP models developed by MOGP for EDP1 to EDP4 have been shown in Figures 437 

5(a)-(d). The Pareto front sets are shown in green circles and the rest of the models are shown in solid 438 

blue circles. As mentioned earlier, the Pareto front set is obtained by using a non-dominated sorting of 439 

populations at the end of all MOGP runs. This process simultaneously optimizes the accuracy and the 440 

complexity of all developed models. The final model in each Pareto front set is selected and highlighted 441 

in a red circle. 442 

In order to benchmark the MOGP models, they were compared with gene expression programming 443 

(GEP) models as a well-known and widely used GP algorithm. The GEP model also uses multiple gene 444 

structure, which makes it similar to the MGGP in this respect. GEP requires 10 times more generations 445 

to converge. It is because the MOGP algorithm converges quickly since it uses regression analysis 446 

beside the evolutionary process. GEP’s parameter setting is similar to that of MOGP (shown in Table 447 

1). The final results of GEP are presented in Figure 5. The results show that none of the models found 448 

by GEP are among the Pareto front sets for any of the EDP1 to EDP4 problems. 449 

 The contribution of each input variable in the mathematical prediction models can be investigated 450 

through their frequencies where a frequency value of 1.0 for a variable indicates that it has the maximum 451 

contribution within the best-generated models (Gandomi et al. 2010). It was assumed that the models 452 

with R2> 0.8 are the best-generated models. The frequency histograms of the input variables for all 453 

predicted EDPs are shown in Fig. 6. As shown, for the selected database of EDP1, PGA and 454 

 respectively show the most and the least statistically significant contributions in the best-generated 455 

MOGP models. For the collected database of EDP2, although all the input variables almost have 456 

significant contributions in the best generated MOGP models, T and  are the most and the least 457 

influential variables on the EDP2 prediction model. According to the literature (e.g. Kuwamura and 458 

Galambos 1989, Manfredi 2001, Dindar et al. 2015), there is no report about the role of PGA on EDP3 459 

https://en.wikipedia.org/wiki/Gene_expression_programming


(hysteretic to input energy ratio). Moreover, based on the physics of the problem, when the damping 460 

ratio is increased, the portion of hysteretic energy dissipation of the imparted seismic input energy is 461 

decreased. These issues have been confirmed by the frequencies of  and PGA for EDP3 where they 462 

have the largest and smallest statistically significant contributions, respectively. Moreover, as can be 463 

seen in the figure, for the collected database relevant to EDP4, T and PGA have the largest and smallest 464 

statistically significant contributions in the best-generated MOGP models, respectively.  465 

5.1. Mathematical Model for EDP1 466 

The mathematical model obtained for EDP1 is expressed as follows: 467 

( )0 1 1 2 2
I

V

E
a a X a X S

m
 = + + , (12-a) 

1 .tanh(tanh( ))X PGA T T


= , (12-b) 

1.5

2
( )

e

T PGA
X


 


−

= , (12-c) 

where a0, is the bias term, a1 and a2 are the gene weight for the EDP1 prediction model. These 468 

coefficients are listed in Table 2. SV is the spectral velocity of the elastic SDOF system.  469 

5.2. Mathematical model for EDP2 470 

The mathematical model derived for EDP2 is expressed as follows: 471 

( )0 1 1 2 2
H

D

E
b b Y b Y S

m
 = + + , (13-a) 

1
Y PGA= , (13-b) 

2

2
( 2 )( )

T
Y e PGA   −= − + − , (13-c) 

where b0, is the bias term, b1 and b2 are the gene weight for EDP2. These coefficients are listed in Table 472 

3. SD is the spectral displacement of the elastic SDOF system.  473 

5.3. Mathematical model for EDP3 474 

The mathematical model derived for EDP3 is expressed as follows: 475 


0 1 1 2 2
H

I

E
HI c c Z c Z

E





= = + + , (14-a) 

2 0.151
1

TZ e e  − − − −= , (14-b) 

2 log ( 7.75) tanh( )
T

Z T


 
= + 

 
, (14-c) 

where c0 is the bias term, c1 and c2 are the gene weight for EDP3. These coefficients are listed in Table 476 

4.  477 

5.4. Mathematical model for EDP4 478 

The mathematical model obtained for EDP4 is expressed as follows: 479 

0 1 1 2 2
I

I
I

E
RE d d U d U

E



= = + + , (15-a) 

1
( 0.422)( 1.4074) tanh( )

T
U e T T   −= + + + − − , (15-b) 

2
2

1
ln( )U T

T
  = + + , (15-c) 

where d0 is the bias term, d1 and d2 are the gene weight for EDP4. These coefficients are listed in Table 480 

5.  481 

It should be noted that EDP2 can also be determined by the following relationship: 482 

I
H I

E
E H

m


 =  (16) 

In fact, there are two ways to compute EDP2: one is by using Eq. (13) directly (called as EDP2-1), and 483 

another is by using Eq. (16) (called EDP2-2). 484 

5.5. Accuracy of the Models 485 

It should be noted that all the models are only valid in the range of actual database (discussed in section 486 

4.1). In order to investigate the effectiveness and accuracy of the EDP models in the range of our 487 

database, the common performance metrics, including the mean absolute percentage error (MAPE), the 488 

relative root mean square error (RRMSE), the linear correlation coefficient (R), the performance index 489 

(PI), coefficient of determination (r2), coefficient of efficiency (E), and the index of agreement (d) 490 


corresponding to the predicted formulation of each EDP are obtained. MAPE, RRMSE, R, PI, r2, E, and 491 

d are expressed as follows: 492 

1

1 n i i
i

i

e p
MAPE

n e=
−

=  , (16) 

2
1
( )1

n
i ii

e p
RRMSE

e n
=

−
=


, (17) 

1

2 2
1 1

( )( )

( ) ( )

n
i i i ii

n n
i i i ii i

e e p p
R

e e p p

=

= =

− −
=

− −


 

, (18) 

1

RRMSE
PI

R
=

+
, (19) 

2
2 1

2
1

( )
1

( )

n
i ii

n
ii

e p
r

p
=

=

−
= −




, (20) 

2
1

2
1

( )
1

( )

n
i ii

n
i ii

e p
E

e e
=

=

−
= −

−




,
 

(21) 

2
1

2
1

( )
1

( )

n
i ii

n
i i i ii

e p
d

e e p e
=

=

−
= −

− + −




,
 

(22) 

where ie and ip  are the average values of the exact and predicted outputs, respectively; and n, ei, and 493 

pi were defined before. Lower MAPE and RRMSE, higher R (or R2), r2, E and d indicate the accuracy 494 

and effectiveness of the prediction model used. Based on Eq. (19), higher R values and lower RRMSE 495 

values result in lower PI and, subsequently, indicate a more precise model. It should be noted that PI 496 

varies from 0 to ∞ and its values close to 0 indicate the model fits very well to the exact (actual) values. 497 

It is worth noting that two sources of complexity affect the accuracy of the models. First, the structures 498 

used are inelastic which leads to high nonlinearity, and second, the earthquake ground motion records 499 

have some influential characteristics, such as frequency content, which make a structure to experience 500 

different cyclic excursions associated with complex behavior. These problems are accentuated when 501 

PGA is increased. 502 


The abovementioned performance metrics of the predicted EDP1, EDP2 (including EDP2-1 and 503 

EDP2-2), EDP3 and EDP4 models using MOGP are listed in Table 6. As can be seen in this table, 504 

EDP1 and EDP2-2 have MAPE values respectively less than almost 16% and 17%, and it is less than 505 

33.1% for EDP2-1 all of which are in an acceptable/reasonable range of the performance metric for 506 

inelastic and complex systems. The MAPE values are very low for both EDP3 and EDP4. Lower 507 

RRMSE values (near-zero usually less than 50%) also confirm the accuracy of the mathematical models. 508 

According to Table 6, except for EDP2-1 with soil type of S4 which has an approximate RRMSE of 509 

54%, EDP1, EDP2-1 and EDP2-2 respectively have RRMSE values less than 36.3%, 44%, and 42.1%, 510 

and the remaining EDPs have RRMSE values less than 6.4%, indicating the accuracy of the predicted 511 

models. The higher accuracy of predicted EDP3 and EDP4 is evident. The R2, E, d and r2 values near 512 

1.0 (e.g. more than 0.8) indicate a good correlation, efficiency and agreement of the predicted values 513 

with the exact ones.  Based on Table 6, the R2, E, d, and r2 values are more than 0.8 for EDP1, EDP2-514 

2, EDP3, and EDP4. The minimum values of R2, E, d, and r2 of EDP2-1are almost equal to 0.68, 0.55, 515 

0.83 and 0.65 which belong to the soil type S4 whilst for other soil types almost all of them are more 516 

than 0.8. Regarding the PI, that is a combination of R and RRMSE, the values less than 0.3 and 0.2 517 

indicate a good and an excellent prediction, respectively. The PI values are less than 0.2 for EDP1, less 518 

than 0.3 for EDP2-1, and less than 0.22 for EDP2-2 which indicate a good prediction capability of their 519 

corresponding proposed MOGP models. This index is less than 0.04 for both EDP3 and EDP4, 520 

confirming the excellent prediction capability of the proposed MOGP models.  521 

To make a more informative and general evaluation of the proposed MOGP models, the average 522 

values of the performance metrics of all soil types, for each of EDPs, are presented in Table 6. The 523 

average results of EDP2 demonstrate that EDP2-2 has a better performance than EDP2-1. In fact, 524 

MAPE, RRMSE, and PI of EDP2-2 model have about 39%, 34.5%, and 37% smaller values as well as 525 

R2, E, d and r2 have almost 17%, 21%, 5.6%, and 13% larger values as compared to those of EDP2-1. 526 

Finally, considering the stochastic nature of earthquake engineering problems and the nonlinear 527 

relations governing the inelastic SDOFs behavior, the results indicate the good capability of MOGP 528 


prediction models for EDP1, EDP2-1 and EDP2-2, and excellent capability of MOGP prediction 529 

models for EDP3 and EDP4. 530 

5.6. Comparative study  531 

5.6.1. Assumptions  532 

In the previous section, it was shown that all the mathematical models precisely predict each EDP of 533 

interest. In this section, the accuracy of the proposed mathematical models of all EDPs (EDP1 to EDP4) 534 

is compared with those of some available models from the relevant literature. The works presented by 535 

Housner (1956), Kuwamura and Galambos (1989), Fajfar and Vidic (1994), Manfredi (2000), Bakhshi 536 

and Tavallali (2006), and Dindar et al. (2015) are selected to make the comparison. All the works 537 

selected herein have dealt with the inelastic SDOF systems having elastoplastic behavior model ( = 0) 538 

and damping ratio () of 0.05. The PGAs of 0.5 and 1.0g and displacement ductility ratios () of 2, 4 539 

and 6, are considered for the comparison. The well-known performance metrics including MAPE, 540 

RRMSE, R2, PI, E, d, and r2 are used for comparison. To make an informative comparison, all the 541 

performance metrics are listed in separated Tables for EDP1, EDP2, (including EDP2-1 and EDP2-2), 542 

EDP3 and EDP4 (Tables 7-10 respectively).  543 

In order to show the varying trend of the predicted models using MOGP and the best model from 544 

literature compared with the exact results, a graphical comparison is also made. As mentioned in the 545 

Introduction section and as can be concluded in the above comparison, Manfredi (2000) model is one 546 

of the best models presented in the literature. Therefore, this is the only model selected to make a more 547 

clear comparison for EDP1, EDP2 (including EDP2-1 and EDP2-2) and EDP3. Moreover, Bakhshi and 548 

Tavallali (2006) model is also used for making a comparison for EDP4. In addition, the inelastic SDOF 549 

systems having  = 0,  = 6 and  = 0.05 are used and are subjected to earthquake records corresponding 550 

to all soil types S1 to S4 with PGAs of 0.5 and 1.0g. The exact and predicted energy-based spectrum 551 

using MOGP and another selected model from the literature are shown in Figures 7-10 respectively for 552 

EDP1 to EDP4.  553 

5.6.2. Comparison 554 


The mathematical model of Eq. (12), proposed for prediction of EDP1 using MOGP, is compared with 555 

the seismic input energy equation presented by Housner (1956), Kuwamura and Galambos (1989), 556 

Manfredi (2000), and Dindar et al. (2015). As shown in Table 7, Eq. (12) and Kuwamura and Galambos 557 

(1989) models almost have similar performance for the soil type of S1 which are better than the other 558 

models. Although Eq. (12) works very well for the soil type of S2, performance metric values of 559 

Manfredi (2000) model, excepting MAPE, show that it works better than Eq. (12) and the other models. 560 

Concerning the soil types of S3 and S4, Eq. (12) has an outperformance compared with the other models 561 

in total. Based on the average values of performance metrics listed in Table 7, Eq. (12) has lower MAPE, 562 

RRMSE and PI, higher R2 and E compared with the other models (and slightly lower d and r2 compared 563 

with Manfredi (2000) model), indicating that the predicted model using MOGP totally outperforms the 564 

other models available in the literature. The varying trend of the Eq. (12) and Manfredi (2000) model 565 

compared with the exact results is also shown in Figure 7. As shown, the varying trend of Eq. (12) is 566 

almost conform to the exact results trend excepting a difference for soil type of S1 at medium-periods 567 

(see Fig. 7(a)) which is not considerable. This is also true for Manfredi (2000) model except for the 568 

long-periods of soil types of S2 and S4 shown in Fig. 7(d). 569 

The mathematical models of Eq. (13) and Eq. (16), proposed for prediction of EDP2 using MOGP, 570 

are compared with the presented models by Kuwamura and Galambos (1989), Manfredi (2000), and 571 

Dindar et al. (2015). The results of the comparison are listed in Table 8. As shown in this table, MOGP-572 

2 model (Eq. (16)) significantly outperforms MOGP-1 model (Eq. (13)). Eq. (16) is resulted to lower 573 

MAPE, RRMSE and PI, and higher R2, E, d and r2 compared with those of for other models with the 574 

exception of RRMSE, PI, R2, E and r2 of soil type of S2 and r2 of soil type of S3 for Manfredi (2000) 575 

model and soil type of S1 for Kuwamura and Galambos (1989) which are slightly better than those of 576 

for Eq. (16). To make an overall comparison, the average values of the performance metrics are listed 577 

in Table 8. These results confirm the superiority of the EDP2 prediction model using MOGP-2 (Eq. 578 

(16)) as compared to those from the literature. The varying trend of the Eq. (13), Eq. (16) and Manfredi 579 

(2000) models compared with the exact results is also shown in Figure 8. As shown, all models 580 

relatively have a consistent varying trend to the exact results except for Eq. (13) at medium- and long-581 


periods and for Manfredi model at long-periods of soil types of S1 and S4. Moreover, as shown in Figs. 582 

7 and 8, the effect of PGA on both EDP1 and EDP2 is obvious as it is confirmed by their frequencies 583 

in MOGP models depicted in Fig. 6(a) and Fig.6 (b). 584 

A similar comparison was also carried out for EDP3 prediction model of Eq. (14) using MOGP.  585 

The predicting equations presented in the works by Kuwamura and Galambos (1989), Fajfar and Vidic 586 

(1994), Manfredi (2000), and Dindar et al. (2015) were selected for comparison. The results of this 587 

comparison are shown in Table 9. According to the table, although some of the mentioned works show 588 

relatively appropriate results, the performance metrics of the prediction model based on Eq. (14) show 589 

lower MAPE, RRMSE and PI, and higher R2, E, d and r2 as compared to the other presented models, 590 

except for R2 for soil type S1 of the model used in Manfredi (2000). This indicates the novel capability 591 

of MOGP for accurate prediction of the EDP3 as compared with those presented in the literature. To 592 

make a general comparison on the performance of the MOGP prediction model of EDP3, the average 593 

values of the performance metrics are listed in Table 9. The results show that Eq. (14) highly 594 

outperforms the other presented models in the literature. Figure 9 also shows the varying trend of Eq. 595 

(14) and Manfredi (2000) model as compared to exact values. As depicted in the figure, the varying 596 

trend of Eq. (14) is in good agreement with exact values except for soil type S1 which has inconsiderable 597 

differences. Despite the exact values and Eq. (14), the Manfredi (2000) model has a constant trend 598 

which has significant differences in the majority of periods. It should be noted that, as depicted in Fig. 599 

9, EDP3 is not affected by PGA as evidenced by its very low frequency in MOGP prediction model for 600 

EDP3 shown in Fig. 6(c). 601 

EDP4 prediction model of Eq. (15) using MOGP was also compared with the Bakhshi and Tavallali 602 

(2006) model. The comparative results are shown in Table 10. According to the table, for all soil types, 603 

lower MAPE, RRMSE and PI, and higher R2, E, d and r2 is obtained for Eq. (15) with respect to the 604 

Bakhshi and Tavallali (2006) model. As shown in the table, average performance metrics are computed 605 

for making a general comparison, demonstrating the superiority of Eq. (15) using MOGP in order to 606 

predict EDP4 rather than the Bakhshi and Tavallali (2006) model. Figure 10 shows the varying trend 607 

of Eq. (15) and Bakhshi and Tavallali (2006) model compared with exact values. The shown graphs 608 


confirm the considerable differences between the Bahshi and Tavalali (2006) model and the exact 609 

values. In contrast, the MOGP model of Eq. (15) has the closest fit to the exact results. In addition, as 610 

shown in this figure, EDP4 is not influenced by PGA as evidenced by its very low frequency in MOGP 611 

prediction model for EDP4 shown in Fig. 6(d). 612 

It should be noted that most models in the literature are developed for a system with a limited 613 

number of SDOF and a low number of earthquake ground motion records. However, the proposed 614 

MOGP-based models can deal with SDOF systems with a wide range of structural features subjected 615 

to moderate-to-severe earthquake ground motions.. 616 

6. Summary and Conclusion 617 

Formulation of the seismic energy demand of inelastic SDOF systems is one of the main steps of 618 

extending the energy-based seismic analysis and design approach. A comprehensive study was carried 619 

out to propose accurate and simple mathematical models for predicting seismic energy demand spectra. 620 

Multi-objective genetic programming (MOGP) was employed to formulate some main energy-based 621 

EDPs, i.e., spectral input and hysteretic energy, spectral hysteretic to input energy ratio, and spectral 622 

energy modification factor. Maximizing the accuracy and minimizing the complexity of the predictive 623 

models were considered as two objectives of the multi-objective optimization procedure. Both 624 

structural and earthquake characteristics were included in the proposed mathematical models. 625 

Regarding each EDP, one equation with different coefficients was proposed for various soil types 626 

assuming that different soil types have linear relationships. The frequency of each input variable in the 627 

best-generated models was also presented to measure the importance of the variable. 628 

Finally, the capability of the proposed models was examined using several common performance 629 

metrics. The results indicate the accuracy and effectiveness of the proposed mathematical models in 630 

predicting the seismic energy demand spectra compared with some of the models presented in the 631 

literature.  632 

 633 

References 634 


Akiyama, H., 1985. Earthquake-resistant limit-state design for buildings, The University of Tokyo Press, Tokyo, 635 

Japan. 636 

Alavi, A.H., Aminian, P., Gandomi, A.H., and Esmaeili, M.A., 2011. Genetic-based modeling of uplift capacity 637 

of suction caissons, Expert Systems with Applications 10, 12608–12618. 638 

Alıcı, F.S., and Sucuoğlu, H., 2016. Prediction of input energy spectrum: attenuation models and velocity 639 

spectrum scaling, Earthquake Engineering and Structural Dynamics 45, 2137–2161. 640 

Amiri, G.G., Darzi, G.A., and Amiri J.V., 2008. Design elastic input energy spectra based on Iranian earthquakes, 641 

Canadian Journal of Civil Engineering 35, 635–646. 642 

ANSI/AISC, 2010. Specification for Structural Steel Buildings, American Institute of Steel Construction, 643 

Chicago, IL. 644 

Arroyo, D., and Ordaz, M., 2007. On the estimation of hysteretic energy demands for SDOF systems, Earthquake 645 

Engineering and Structural Dynamics 36, 2365–82. 646 

Babanajad, S.K., and Gandomi, A.H., Mohammadzadeh, D., Alavi, A.H., 2013. Numerical modeling of concrete 647 

strength under multiaxial confinement pressures using linear genetic programming, Automation in Construction 648 

36, 136–144.  649 

Benavent-Climent, A., and Lopez-Almansa, F., and Bravo-Gonzalez, D.A., 2010. Design energy input spectra for 650 

moderate-to-high seismicity regions based on Colombian earthquakes, Soil Dynamics and Earthquake 651 

Engineering 30, 1129–1148.  652 

Bertero, V.V., and Uang, C.M., 1988. Implications of recorded earthquake ground motions on seismic design of 653 

building structures, Research Report, UCB/EERC-88/13, University of California at Berkeley, Los Angeles, CA. 654 

Cabalar, A. F., and Cevik, A., 2011. Triaxial behavior of sand–mica mixtures using genetic programming. Expert 655 

Systems with Applications 38, 10358–10367.  656 

Chopra, A.K., 2012. Dynamics of structures: theory and applications to earthquake engineering, Fourth Edition, 657 

Prentice Hall, CA. 658 

Chou, C.C., and Uang, C.M., 2000 Establishing absorbed energy spectra – an attenuation approach,Earthquake 659 

Engineering and Structural Dynamics 29, 1441–1455. 660 

Deb, K., Pratap, A., Agarwal, S., and Meyarivan, T., 2002. A fast and elitist multi objective genetic algorithm: 661 

NSGA-II, IEEE Transactions on Evolutionary Computation6, 182–197. 662 

Decanini, L.D., and Mollaioli, F., 2001. An energy-based methodology for the assessment of seismic demand, 663 

Soil Dynamic and Earthquake Engineering 21, 113–137. 664 


Deniz, D., Song, J., and Hajjar, J.F., 2017. Energy-based seismic collapse criterion for ductile planar structural 665 

frames, Engineering Structures 141, 1–13. 666 

Dindar, A.A., Yalçın, C., Yüksel, E., Özkaynak, H., and Büyüköztürke, O., 2015. Development of earthquake 667 

energy demand spectra, Earthquake Spectra 31, 1667–1689. 668 

Fajfar, P., and Vidic, T., 1994. Consistent inelastic design spectra: hysteretic and input energy, Earthquake 669 

Engineering and Structural Dynamics 23, 523–537. 670 

Fajfar, P., Vidic, T., and Fischinger, M., 1989. Seismic demand in medium- and long-period structures, 671 

Earthquake Engineering and Structural Dynamics 18, 1133–1144. 672 

Fajfar. P., and Krawinkler. H., 1992. Nonlinear seismic analysis and design of reinforced concrete buildings, 673 

Elsevier Applied Science, New York. 674 

Federal Emergency Manage Agency (FEMA), 2000. Prestandard and commentary for the seismic rehabilitation 675 

of buildings. FEMA-356, prepared by the American Society of Civil Engineers, Washington, D.C. 676 

Ferreira, C., 2006. Gene expression programming: mathematical modeling by an artificial intelligence (Vol. 21). 677 

Springer. 678 

Gandomi, A. H., and Alavi, A. H., 2011. Multi-stage genetic programming: a new strategy to nonlinear system 679 

modeling. Information Sciences 181, 5227-5239. 680 

Gandomi, A. H., Babanajad, S. K., Alavi, A. H., and Farnam, Y. (2012). Novel approach to strength modeling of 681 

concrete under triaxial compression. Journal of Materials in Civil Engineering 24, 1132–1143. 682 

Gandomi, A.H., Alavi, A.H., Mirzahosseini, M.R., and Nejad, F.M., 2010. Nonlinear genetic-based models for 683 

prediction of flow number of asphalt mixtures, Journal of Materials in Civil Engineering 23, DOI: 684 

10.1061/(ASCE)MT.1943-5533.0000154. 685 

Gandomi, A.H., and Alavi, A.H., 2012a. A new multi-gene genetic programming approach to nonlinear system 686 

modeling. Part I: materials and structural engineering problems, Neural Computing and Applications 21, 171–687 

187.  688 

Gandomi, A.H., and Alavi, A.H., 2012b. A new multi-gene genetic programming approach to nonlinear system 689 

modeling. Part II: geotechnical and earthquake engineering problems, Neural Computing and Applications 21, 690 

189–201. 691 

Gandomi, A.H., and Roke, D.A., 2015. Assessment of artificial neural network and genetic programming as 692 

predictive tools, Advances in Engineering Software 88, 63–72.  693 

http://ascelibrary.org/journal/jmcee7
https://doi.org/10.1061/(ASCE)MT.1943-5533.0000154


Gandomi, A.H., Roke, D.A., and Sett, K., 2013. Genetic programming for moment capacity modeling of 694 

ferrocement members, Engineering Structures 57, 169–176.  695 

Gandomi, A.H., Sajedi, S., Kiani, B., and Huang, Q., 2016. Genetic programming for experimental big data 696 

mining: A case study on concrete creep formulation, Automation in Construction 70, 89–97.  697 

Gani, A., Siddiqa, A., Shamshirband, S.,and Hanum, F., 2016. A survey on indexing techniques for big data: 698 

taxonomy and performance evaluation, Knowledge and Information Systems 46, 241–284.  699 

Gharehbaghi, S., 2018. Damage controlled optimum seismic design of reinforced concrete framed structures, 700 

Structural Engineering and Mechanics 65, 53–68. 701 

Gharehbaghi, S., and Khatibinia, M., 2015. Optimal seismic design of reinforced concrete structures subjected to 702 

time–history earthquake loads using an intelligent hybrid algorithm, Earthquake Engineering and Engineering 703 

Vibration 14, 97–109. 704 

Gholizadeh, S., and Salajegheh, E., 2009. Optimal design of structures for time history loading by swarm 705 

intelligence and an advanced metamodel, Computer Methods in Applied Mechanics and Engineering 198, 2936–706 

49.  707 

Gupta, A.K., 1990. Response spectrum method in seismic analysis and design of structures. CRC Press, Boca 708 

Raton, FL. 709 

Hii, C., Searson, D.P., and Willis, M., 2011. Evolving toxicity models using multigene symbolic regression and 710 

multiple objectives, International Journal of Machine Learning and Computing 1, 30–35. 711 

Housner, G.W., 1956. Limit design of structures to resist earthquakes, in Proceedings, of the First World 712 

Conference on Earthquake Engineering, 5, 1–13. 713 

Kalkan, E., and Kunnath, S.K., 2007. Effective cyclic energy as a measure of seismic demand effective cyclic 714 

energy as a measure of seismic demand, Journal of Earthquake Engineering 11,725–751. 715 

Kayadelen, C., Günaydın, O., Fener, M., Demir, A., &Özvan, A. (2009). Modeling of the angle of shearing 716 

resistance of soils using soft computing systems. Expert Systems with Applications 36, 11814-11826.  717 

Khan, S.A., Shahani, D.T., and Agarwala, A.K., 2003. Sensor calibration and compensation using artificial neural 718 

network, ISA Trans 42, 337–352. 719 

Khashaee, P., 2004. Energy–based seismic design and damage assessment for structures, Ph.D. Dissertation, 720 

Department of Civil Engineering, Southern Methodist University, USA. 721 

Khatibinia, M., Gharehbaghi, S., and Moustafa, A., 2015. Seismic reliability-based design optimization of 722 

reinforced concrete structures including soil-structure interaction effects, Earthquake Engineering- From 723 


Engineering Seismology to Optimal Seismic Design of Engineering Structures, Chapter 11, Moustafa A (ed.), 724 

InTech, 267–304. 725 

Koza, J. R., 1990. Genetic programming: A paradigm for genetically breeding populations of computer programs 726 

to solve problems (Vol. 34). Stanford, CA: Stanford University, Department of Computer Science. 727 

Kuwamura, H., and Galambos, T.V., 1989. Earthquake load for structural reliability, Journal of Structural 728 

Engineering 115, 1446-1462. 729 

Manfredi, G., 2001. Evaluation of Seismic Energy Demand, Earthquake Engineering and Structural Dynamics 730 

30, 485–499. 731 

MATLAB, 2010. The language of technical computing, Math Works Inc. 732 

Metenidis, M.F., Witczak, M., and Korbicz, J., 2004. A novel genetic programming approach to nonlinear system 733 

modelling: application to the DAMADICS benchmark problem, Engineering Applications of Artificial 734 

Intelligence17, 363–370. 735 

Mirzahosseini, M.R., Aghaeifar, A., Alavi, A.H., Gandomi, A.H., and Seyednour, R., 2011. Permanent 736 

deformation analysis of asphalt mixtures using soft computing techniques, Expert Systems with Applications 38, 737 

6081–6100. 738 

Papadopoulos, V., and Giovanis, D.G., Lagaros, N.D., Papadrakakis, M., 2012. Accelerated subset simulation 739 

with neural networks for reliability analysis, Computer Methods in Applied Mechanics and Engineering 223–224, 740 

70–80. 741 

PEER Strong Motion Database. 2017; http://ngawest2.berkeley.edu/. 742 

Quinde, P., and Reinoso, E., Terán-Gilmore, A., 2016. Inelastic seismic energy spectra for soft soils: Application 743 

to Mexico City, Soil Dynamics and Earthquake Engineering 89,198–207.  744 

Sajjadi, S., Shamshirband, S., Alizamir, M., Yee, L., Mansor, Z., Manaf, A.A. et al. 2016, Extreme learning 745 

machine for prediction of heat load in district heating systems, Energy and Buildings 122, 222–227. 746 

Salajegheh, E., and Heidari, A., 2005. Optimum design of structures against earthquake by wavelet neural network 747 

and filter banks, Earthquake Engineering and Structural Dynamic 34, 67挑82. 748 

Searson, D., Willis, M., and Montague, G., 2007. Co-evolution of non-linear PLS model components, Journal of 749 

Chemometrics 21, 592–603.  750 

Searson, D.P., 2014, GPTIPS 2: an open-source software platform for symbolic data mining, (arXiv preprint 751 

arXiv:14124690). 752 

https://www.infona.pl/resource/bwmeta1.element.elsevier-14dffa12-d307-3f3e-a661-a4637cd14bf6/tab/jContent
https://www.infona.pl/resource/bwmeta1.element.elsevier-14dffa12-d307-3f3e-a661-a4637cd14bf6/tab/jContent
http://www.sciencedirect.com/science/article/pii/S0045782512000552
http://www.sciencedirect.com/science/article/pii/S0045782512000552
http://ngawest2.berkeley.edu/


Searson, D.P., Leahy, D.E., and Willis, M.J., 2010. GPTIPS: an open source genetic programming toolbox for 753 

multigene symbolic regression, in Proceedings of the International Multiconference of Engineers and Computer 754 

Scientists: Citeseer , 77–80.  755 

Sucuoğlu, H., and Nurtuğ, A., 1995. Earthquake ground motion characteristics and seismic energy dissipation, 756 

Earthquake Engineering and Structural Dynamics 24, 1195–1213. 757 

Tsompanakis, Y., and Topping, B.H.V., 2011. Soft computing methods for civil and structural engineering, Saxe-758 

Coburg Publications, Stirlingshire, UK. 759 

Uang, C.M., and Bertero, V.V., 1988. Implications of recorded earthquake ground motions on seismic design of 760 

building structures, Earthquake Engineering Research Center , Report No. UCB/EERC-88/13, University of 761 

California, Berkeley. 762 

Uang, C.M., and Bertero, V.V., 1990. Evaluation of seismic energy in structures, Earthquake Engineering and 763 

Structural Dynamics 19, 77–90. 764 

Vardhan, H., Garg, A., Li, J., and Garg, A., 2016. Measurement of stress dependent permeability of unsaturated 765 

clay, Measurement 91, 371–376. 766 

Walter, E., and Pronzato, L., 1997. Identification of parametric models: from experimental data , 767 

Communications and Control Engineering, Springer, 413 pp.  768 

Yazdani, H., Khatibinia, M., Gharehbaghi, S., and Hatami, K., 2016. Probabilistic optimum seismic design of 769 

reinforced concrete structures considering soil-structure interaction effects, ASCE-ASME Journal of Risk and 770 

Uncertainty in Engineering Systems, Part A: Civil Engineering 3, G4016004–1–12; DOI: 771 

10.1061/AJRUA6.0000880. 772 

Zahra, T.F., and Hall W.J., 1984. Earthquake energy absorption in SDOF structures, Journal of Structural 773 

Engineering 110, 1757–1772. 774 

Zhai, C., Ji, D., Wen, W., Lei, W., Xie, L., and Gong, M., 2016. The inelastic input energy spectra for main shock–775 

aftershock sequences Earthquake Spectra 32, 2149–2166. 776 


	A HYBRID COMPUTATIONAL APPROACH FOR SEISMIC ENERGY DEMAND PREDICTION
	S. Gharehbaghia, A.H. Gandomib, , and S. Achakpourc, M.N. Omidvard
	1. Introduction
	2. Seismic Energy Concept and Formulation
	3. Genetic Programming
	3.1. Multi-Gene Symbolic Regression
	4. Predicting Seismic Energy Demand Spectra Using MOGP
	5. Results and Discussion