key: cord-0909372-iv3x50mj
authors: Kobaka, Janusz
title: Principal Component Analysis as a Statistical Tool for Concrete Mix Design
date: 2021-05-19
journal: Materials (Basel)
DOI: 10.3390/ma14102668
sha: 898af92f092246d8b2d545ed343bdf9021bcdc02
doc_id: 909372
cord_uid: iv3x50mj

With the recent and rapid development of concrete technologies and the ever-increasing use of concrete, adapting concrete to the specific needs and applications of civil engineering is necessary. Due to economic considerations and care for the natural environment, improving the methods currently used in concrete design is also necessary. In this study, the author used principal component analysis as a statistical tool in the concrete mix design process. Using a combination of PCA variables and 2D and 3D factors has made it possible to refine concrete recipes. Thirty-eight concrete mixes of different aggregate grades were analyzed using this method. The applied statistical analysis showed many interesting relationships between the properties of concrete and the content of its components, such as the clustering of certain properties, showing dependence between the properties and the quantities of certain ingredients in concrete, and reducing noise in the data, which most importantly simplifies interpretation. This method of analysis can be used as an aid for concrete mix design.

With the progression of civilization, a primary concern in civil engineering is building modern infrastructures for the industry and human housing needs. Concrete is still a commonly used material in construction all over the world [1-4], with its use in many applications and a variety of compositions and production technologies [5]. The concrete industry consumes the second greatest amount of natural resources [6]; thus, proper concrete design is important for environmental [7, 8] and economic reasons [9, 10].
Decisive initiatives should be taken today towards optimizing mix designs by taking into account their environmental impact so that the use of natural resources can be reduced [7]. Concrete mix design is a complex process, and many methods have been developed to achieve concrete with desirable properties. Nowadays, various types of by-products, such as fly ash, silica fume, and rice husk ash, are widely used as pozzolanic materials in concrete [11]. Additionally, chemical admixtures are essential materials and the core technology for manufacturing modern concrete in high-tech fields [12]. However, the more components there are in concrete, the more complex the design process becomes. The difference between poor-quality and good-quality concrete rests not so much on the choice of ingredients as on their proportions [13].

In 1968, Powers [14] noticed that, at the macro-scale, successive filling of voids by smaller particles can increase the packing density of the aggregate [15]. Increasing the packing densities of the aggregate and cementitious materials allows the manufacturer to produce a high-performance concrete [15, 16]. Among the most popular are methods derived from the three equations method [17, 18], which allow a user to design concrete characterized by well-packed ingredients. Currently, the most popular mix design methods are the maximum density method, the fineness modulus method, the American Concrete Institute (ACI) mix design method, the Road Research Laboratory (RRL) method, and the Department of Energy (DOE) method [19]. There have also been some efforts to develop computer-aided approaches for mix design, such as an artificial neural network (ANN)-based method [11, 20].

Principal component analysis (PCA) is a powerful tool that finds internal correlations within a set of data and develops a statistical representation of these datasets [21]. Moreover, it is central to the study of multivariate data [22].
In PCA, a set of factor axes in n-dimensional space is created by a rotation of the original set of axes describing multidimensional objects in an attempt to achieve a simple structure [23]. The origin of the factor axes corresponds to the mean values of all variables. The main goals of PCA are to identify hidden patterns in a data set, to reduce the dimensionality of the data by removing the noise and redundancy in the data, and to identify correlated variables [24]. PCA has gained popularity by showing strong patterns, especially in complex datasets [25]. The areas of application of PCA include biology [26, 27], medicine [28, 29], pharmacy [30], climatology [31], civil engineering [32, 33], and many others. There have also been some attempts to use PCA in concrete mix design; e.g., Deepika [34] used PCA variables to improve concrete mix design, while Boukhatem [35] used them to predict concrete properties.

In this paper, the author proposes using a combination of PCA variables and 2D and 3D factors to refine the concrete design process.

The data used for the analysis are based on the author's previous test results [36]. The concrete mixes used in the tests consisted of Portland Cement CEM I 32.5N manufactured at the Kujawy cement plant located in Bielawy, Poland; three fractions of the aggregate, namely 0-0.5 mm, 0.5-2 mm, and 2-4 mm; and tap water (see Table 1). No additives were used, so that the test results would reflect mainly the influence of the aggregate grading on the concrete properties. The tested points from the experimental plan were plotted using three-dimensional coordinates [37] in relation to the percentage of specific fractions.
Table 1. Mix number, aggregate fractions (0-0.5 mm, 0.5-2 mm, 2-4 mm), cement, and water content of the tested mixes.

Mix   0-0.5 mm   0.5-2 mm   2-4 mm   Cement  Water
1     0          1570       0        358     189
2     157        1417       0        397     209
3     309        1235       0        394     207
4     480        1121       0        384     202
5     614        921        0        422     222
6     755        755        0        429     226
7     878        585        0        452     238
8     1007       432        0        488     257
9     1107       277        0        487     256
10    1225       136        0        478     251
11    1362       0          0        492     259
12    0          1480       164      380     200
13    163        1303       163      360     190
14    319        1118       160      398     210
15    479        958        160      405     213
16    617        771        154      419     221
17    751        601        150      403     212
18    850        425        142      474     249
19    1025       293        146      418     220
20    1141       143        143      472     248
21    1279       0          142      457     241
22    0          1285       321      370     194
23    167        1167       333      350     184
24    331        996        331      375     198
25    512        853        341      376     198
26    628        628        314      429     226
27    810        491        327      410     216
28    904        302        302      439     231
29    1065       152        304      425     224
30    1226       0          306      420     221
31    0          1209       518      354     186
32    168        1008       504      346     182
33    344        860        516      341     180
34    522        696        522      343     181
35    670        502        502      379     200
36    817        327        490      378     199
37    948        158        474      403     212
38    1084       0          465      431     227

The aggregate fractions 0-0.5 mm and 0.5-2 mm were varied from 0 to 100% in steps of 10%, and the fraction 2-4 mm was varied from 0 to 30% in the same steps (see Figure 1). The water-to-cement ratio was constant and equal to 0.53 for all 38 mixes. All of the components were mixed in a concrete mixer for 2 min, starting from the moment the dosing of the ingredients ended. During molding, the concrete was compacted for 1.5 min using a vibration table operating at a frequency of 50 Hz. The concrete specimens were 150 × 150 × 150 mm cubes. Afterward, the specimens were cured for 28 days in laboratory conditions at a temperature of +20 °C and a relative humidity of over 90%.
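The constant water-to-cement ratio and the percentage steps of the aggregate fractions stated above can be cross-checked against a few rows of Table 1. The following Python sketch is illustrative only: the column order (aggregate 0-0.5 mm, 0.5-2 mm, 2-4 mm, cement, water) is an assumption inferred from the variable designations used in the text, and the function names are hypothetical.

```python
# Rows copied from Table 1. Assumed column order (inferred, not stated in
# the extracted table): mix no., aggregate 0-0.5 mm, aggregate 0.5-2 mm,
# aggregate 2-4 mm, cement, water.
MIXES = [
    (1, 0, 1570, 0, 358, 189),
    (11, 1362, 0, 0, 492, 259),
    (26, 628, 628, 314, 429, 226),
    (38, 1084, 0, 465, 431, 227),
]


def water_cement_ratio(cement, water):
    """Return the water-to-cement ratio of a mix."""
    return water / cement


def aggregate_shares(fine, medium, coarse):
    """Return each aggregate fraction as a percentage of the total aggregate."""
    total = fine + medium + coarse
    return tuple(100.0 * amount / total for amount in (fine, medium, coarse))


for number, fine, medium, coarse, cement, water in MIXES:
    ratio = water_cement_ratio(cement, water)
    shares = aggregate_shares(fine, medium, coarse)
    formatted = ", ".join(f"{share:.0f}%" for share in shares)
    print(f"mix {number}: w/c = {ratio:.2f}, aggregate shares = {formatted}")
```

For the four rows used here, the ratio rounds to 0.53 in each case, and mix 26 corresponds to a 40/40/20% split of the three fractions, which agrees with the experimental plan described in the text.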
The research program was divided into two stages. During the first stage, the properties of fresh mixes, such as consistency, apparent density, and air content, were tested. During the second stage, the properties of the hardened concrete, namely density, compressive strength, and splitting tensile strength, were examined. The test procedures were based on European standards (see Table 2).

The test results of the fresh concrete mix (see Table 3) showed that its consistency ranged from 4.5 s, which characterizes consistency V4, to 9.2 s, which characterizes consistency V3, according to the EN 206 standard. The apparent density ranged from 2090 to 2280 kg/m³, and the air content ranged from 2.5 to 9.0%. The test results for concrete in a hardened state showed that the apparent density ranged from 1996 to 2217 kg/m³, that the compressive strength ranged from 15.30 to 25.60 MPa, and that the splitting tensile strength ranged from 1.9 to 2.7 MPa (see Table 4).
The compressive strength in relation to the percentage of the three aggregate fraction groups (see Figure 2) shows that concrete characterized by the highest values of compressive strength also contained the most 2-4 mm aggregate (up to 30%), and that concrete characterized by the lowest values contained the finest aggregate, 0-0.5 mm (up to 50%); this also applied to splitting tensile strength (see Figure 3).

In order to determine the number of factors used in PCA [38], a scree plot of eigenvalues was constructed. One can see that the "elbow" of the graph, where the eigenvalues appear to level off, is found at eigenvalue 3, which means that factors to the left of this point should be retained as they are significant. The first two factors explain 74.35% of the variance, while the first three factors explain 84.47% of the variance (see Figure 4). Two or three factors can be visualized in 2D or 3D plots.
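The quantities read from a scree plot, namely the eigenvalues and the cumulative share of explained variance, can be computed as in the following minimal Python sketch. It assumes that PCA is performed on the correlation matrix (i.e., on standardized variables), which the text does not state explicitly, and it uses synthetic stand-in data with 38 cases and 11 variables rather than the measured values; the function name is hypothetical.

```python
import numpy as np


def explained_variance(data):
    """Return eigenvalues of the correlation matrix (descending) and the
    cumulative percentage of variance explained by the leading factors."""
    corr = np.corrcoef(data, rowvar=False)
    # eigvalsh returns eigenvalues in ascending order, so reverse them.
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]
    cumulative = 100.0 * np.cumsum(eigenvalues) / eigenvalues.sum()
    return eigenvalues, cumulative


# Synthetic stand-in data (NOT the measured results): 38 cases, 11 variables
# driven by three hidden factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(38, 3))
mixing = rng.normal(size=(3, 11))
data = latent @ mixing + 0.3 * rng.normal(size=(38, 11))

eigenvalues, cumulative = explained_variance(data)
for k, (value, total) in enumerate(zip(eigenvalues, cumulative), start=1):
    print(f"factor {k}: eigenvalue {value:.2f}, cumulative variance {total:.1f}%")
```

Plotting the eigenvalues against the factor number gives the scree plot; the cumulative column corresponds to figures such as the 74.35% and 84.47% reported for the first two and three factors.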
In the PCA analysis (see Table 5), the variables taken into account were the concrete ingredients (designated as 1 to 5), the properties of the fresh concrete mix (designated as 6 to 8), and the properties of the hardened concrete (designated as 9 to 11). The variables with the highest contributions to the three factors are marked in red in the table: in factor 1, they were cement, water content, and concrete density; in factor 2, they were aggregates 0-0.5 mm and 0.5-2 mm and air content; and in factor 3, they were consistency, aggregate 0.5-2 mm, and air content.

In the PCA projection of the variables in the 2D factor loading space (see Figure 5), one can see that variables 4 and 5 (cement and water content, see Table 5) were plotted along the same direction, which is expected because the water/cement ratio was equal for all concrete mixes in the experiment; thus, those variables are strongly correlated.

Figure 5. PCA projection of variables set in a 2D factor loading space (for the variable designations, see Table 5).
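The observation that cement and water content share one direction in the loading plot can be reproduced with a small Python sketch. It is a sketch under stated assumptions: loadings are taken as eigenvectors of the correlation matrix scaled by the square root of their eigenvalues, the contribution of a variable to a factor is taken as its squared loading divided by the sum of squared loadings of that factor (the text does not give the formula behind Table 5), and the data are synthetic, with water set to 0.53 times cement.

```python
import numpy as np


def pca_loadings(data, n_factors):
    """Loadings of the first n_factors principal components, computed from
    the correlation matrix (assumed here, not confirmed by the source)."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1][:n_factors]
    # Loading = eigenvector scaled by sqrt(eigenvalue), i.e. the correlation
    # between an original variable and a factor.
    return eigenvectors[:, order] * np.sqrt(eigenvalues[order])


def contributions(loadings):
    """Percentage contribution of each variable to each factor, defined here
    as squared loading over the column sum of squared loadings."""
    squared = loadings ** 2
    return 100.0 * squared / squared.sum(axis=0)


# Synthetic stand-in data (NOT the measured results).
rng = np.random.default_rng(1)
cement = rng.uniform(340.0, 500.0, size=38)
water = 0.53 * cement          # constant w/c ratio, as in the experiment
other = rng.normal(size=(38, 3))
data = np.column_stack([cement, water, other])

loadings = pca_loadings(data, n_factors=2)
print("cement loading:", np.round(loadings[0], 3))
print("water loading: ", np.round(loadings[1], 3))
print("contributions (%):")
print(np.round(contributions(loadings), 1))
```

Because water is an exact multiple of cement, the two variables receive the same loading vector and therefore coincide in the 2D plot; the contributions of each factor sum to 100%.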
Placing variables 4 and 5 in the same direction is an example of reducing the noise in the data using PCA. Variables 8, 9, and 10 (mix density, compressive strength, and concrete density, respectively) are strongly correlated with each other because their projections lie close to each other. These variables are also strongly correlated with variable 3 (aggregate 2-4 mm), which indicates that a high content of this aggregate is correlated with high densities of the fresh mix and the hardened concrete and with high compressive strengths. Variable 7 (air content in the fresh mix) lies almost directly opposite variable 3, which means that a high content of the coarsest fraction (aggregate 2-4 mm) is correlated with low values of air content in the fresh concrete mix.

PCA with object grouping in a two-dimensional space shows that most cases characterized by a compressive strength of 22 MPa or above (see Figure 6) and a splitting tensile strength over 2.5 MPa (see Figure 7) are located in the bottom left of the two charts. Variables 3, 8, 9, 10, and 11 (see Figure 5), assigned to aggregate 2-4 mm, mix density, compressive strength, concrete density, and splitting tensile strength, are also located in this area of the chart. One can conclude that a high volume of the coarse aggregate is correlated with higher densities of the concrete in the fresh and hardened states and with higher compressive and splitting tensile strengths.
Most cases characterized by a compressive strength of 16 MPa or below (see Figure 6) and a low splitting tensile strength (see Figure 7) are located in the bottom right of the two charts. Variables 1, 4, and 5, assigned to aggregate 0-0.5 mm, cement, and water content, are also located in this area of the chart (see Figure 5). One can conclude that a high volume of fine aggregate is correlated with a higher content of cement paste (water + cement) because of the high specific surface area of very fine aggregates; however, due to the constant w/c ratio, this did not improve the compressive or splitting tensile strength.

Variables 8, 9, and 10 (mix density, compressive strength, and concrete density in the hardened state, respectively; see Table 5) are located at positions similar to those of the points of highest compressive and splitting tensile strengths (see Figures 8-10). Variable 1 (aggregate 0-0.5 mm) is located at a position on the chart similar to that of the points of lowest compressive and splitting tensile strengths.
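The object grouping described above, in which cases are projected onto the factor axes and then grouped by a strength threshold, can be sketched in Python as follows. The thresholds of 22 MPa and 16 MPa are taken from the text; everything else is an assumption made for illustration: the data are synthetic, the correlation-matrix form of PCA is assumed, and the function names are hypothetical.

```python
import numpy as np


def pca_scores(data, n_factors):
    """Project standardized cases onto the first n_factors principal axes."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1][:n_factors]
    return z @ eigenvectors[:, order]


def group_centroids(scores, values, high, low):
    """Mean factor coordinates of cases at or above `high` and of cases
    at or below `low`."""
    return scores[values >= high].mean(axis=0), scores[values <= low].mean(axis=0)


# Synthetic stand-in data (NOT the measured results): strength rises with the
# coarse fraction and density and falls with the fine fraction.
rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 38)
strength = 15.3 + 10.3 * t                       # MPa, spans the reported range
coarse = 30.0 * t + rng.normal(0.0, 2.0, 38)     # % of aggregate 2-4 mm
fine = 50.0 * (1.0 - t) + rng.normal(0.0, 4.0, 38)
density = 2000.0 + 200.0 * t + rng.normal(0.0, 15.0, 38)
data = np.column_stack([fine, coarse, density, strength])

scores = pca_scores(data, n_factors=2)
high_centroid, low_centroid = group_centroids(scores, strength, high=22.0, low=16.0)
print("centroid of cases >= 22 MPa:", np.round(high_centroid, 2))
print("centroid of cases <= 16 MPa:", np.round(low_centroid, 2))
```

In this synthetic example the two groups fall on opposite sides of the origin along the first factor, which mirrors the separation of high-strength and low-strength cases seen in the grouping charts. The sign of a factor axis is arbitrary, so only the relative positions of the groups are meaningful.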
Taking into account the third factor and adding the third dimension to the 2D chart (compare Figures 5 and 8) revealed consistency as an important property of concrete that largely influences the statistical model created using PCA. The contribution of consistency (variable 6) is high, at 66.2% (see Table 5). This was not visible in the 2D chart. In the 3D model (see Figure 8), cases characterized by a consistency of 8.5 s or above were plotted at the top of the chart, and cases characterized by a consistency of 7 s or below were plotted at the bottom (see Figure 11).
Figure 11. PCA with object grouping in a three-dimensional space on the basis of concrete composition in relation to properties. Consistency: red represents 8.5 s or above, and blue represents 7 s or below.

The PCA carried out in the experiment described above showed a strong tendency to group cases with similar properties. The cases characterized by desirable properties, i.e., high compressive strength (see Figures 6 and 9), splitting tensile strength (see Figures 7 and 10), or consistency (see Figure 11), are situated along the same direction as the variables that influenced those properties the most (see Figures 5 and 8). An appropriate change in these variables therefore changes the desirable properties of concrete. This makes PCA a useful tool for better understanding the concrete design process and an excellent aid in refining the composition of a concrete mixture.

The principal component analysis method was used as a concrete mix design tool to obtain the following conclusions:
• Cases with similar properties clustered together; e.g., cases characterized by high compressive and splitting tensile strength were plotted together.
• A dependence between the properties and quantities of certain ingredients in concrete was observed; for instance, a high compressive strength corresponded to a high content of coarse aggregate fractions, and a low compressive strength corresponded to a high content of fine aggregate fractions.
• Noise was reduced in the data, which simplified the interpretation of the most important factors influencing the model: because the water/cement ratio was constant in the experiment, water and cement content were plotted together on the chart; other correlated variables, such as mix density and concrete density, were plotted close to one another.
• Elements that influenced the model to a large extent were recognized; in factor 1, they were water and cement content and concrete density.
• PCA was found to be useful as an aid for concrete mix design.
• Using a combination of PCA variables and 2D and 3D factors, it is also an excellent aid in refining the composition of a concrete mixture with specific properties.
• It could also be useful for designing other types of concretes by relying on the test results of these concretes.

Funding: This research received no external funding.
Informed Consent Statement: Not applicable.
The data presented in this study are available upon request from the corresponding author.

A Durable Concrete Mix Design Approach Using Combined Aggregate Gradation Bands and Rice Husk Ash Based Blended Cement
Modeling of Concrete for Nonlinear Analysis Using Finite Element Code ABAQUS
An Overview of Fly Ash and Bottom Ash Replacement in Self Compaction Concrete
Identification of Fracture Mechanic Properties of Concrete and Analysis of Shear Capacity of Reinforced Concrete Beams without Transverse Reinforcement
Incorporating Environmental Evaluation and Thermal Properties of Concrete Mix Designs
Moving towards Resource Conservation by Automated Prioritization of Concrete Mix Design
A Comparative Cradle-to-Gate Life Cycle Assessment of Three Concrete Mix Designs
Concrete: An Eco Material That Needs to Be Improved
Optimum Mix Design of Recycled Concrete Based on the Fresh and Hardened Properties of Concrete
A Computer-Aided Approach to Pozzolanic Concrete Mix Design
Effect of Functional Superplasticizers on Concrete Strength and Pore Structure
The New Concrete
The Properties of Fresh Concrete
Concrete Mix Design Based on Water Film Thickness and Paste Film Thickness
Numerical Modelling of Concrete-to-UHPC Bond Strength
Machine Learning Techniques in Concrete Mix Design
Three Equations Method for Normal Concrete Mix Design
A Comparative Study of Popular Concrete Mix Design Methods from Qualitative and Cost-Effective Point of View for Extreme Environment
Predicting Strength of Recycled Aggregate Concrete Using Artificial Neural Network, Adaptive Neuro-Fuzzy Inference System and Multiple Linear Regression
Statistical Analysis of the Links among Lunar Mare Soil Mineralogy, Chemistry, and Reflectance Spectra
Principal Component Analysis, Second Edition
Choosing the Right Type of Rotation in PCA and EFA
Practical Guide To Principal Component Methods in R
Pilbara Craton Soil as a Possible Lunar Soil Simulant for Civil Engineering Applications
Practical Statistics for Field Biology
Size Correction in Biology: How Reliable Are Approaches Based on (Common) Principal Component Analysis? Oecologia
Principal Component Analysis Applications in COVID-19 Genome Sequence Studies
Principal Component Analysis of Personalized Biomolecular Corona Data for Early Disease Detection
Analysis of Distribution of Ingredients in Commercially Available Clarithromycin Tablets Using Near-Infrared Chemical Imaging with Principal Component Analysis and Partial Least Squares
An Example of Principal Component Analysis Application on
On the Effectiveness of Principal Component Analysis for Decoupling Structural Damage and Environmental Effects in Bridge Structures
The Use of PCA and Signal Processing Techniques for Processing Time-Based Construction Settlement Data of Road Embankments
Principal Component Analysis for Concrete Mix by Ranking Method
Predicting Concrete Properties Using Neural Networks (NN) with Principal Component Analysis (PCA) Technique
Influence of Fine Aggregate Grading on Properties of Cement Composite
The Assessment of Fine Aggregate Pit Deposits for Concrete Production
The Scree Test for the Number of Factors