key: cord-0270105-0oh3divh authors: Sizemore, Nicholas E.; Nogueira, Monica L.; Greis, Noel P.; Davies, Matthew A. title: Application of Machine Learning to the Prediction of Surface Roughness in Diamond Machining date: 2020-12-31 journal: Procedia Manufacturing DOI: 10.1016/j.promfg.2020.05.142 sha: c9785f80ba599e7cd9377de5312f2ff19d93f0f4 doc_id: 270105 cord_uid: 0oh3divh Abstract The manufacturing process for single-point diamond turning germanium (Ge) can be complex when it comes to freeform IR optics. The multi-variant problem requires an operator to understand that the machining input parameters and choice of tooling will dictate the efficiency of generating surfaces with the appropriate tolerances. Ge is a brittle material and exhibits surface fracture when diamond turned. However, with the introduction of a negatively raked tool, surface fracture can be suppressed such that plastic flow of the material is possible. This paper focuses on the application and evaluation of machine learning methods to better assist the prediction of surface roughness parameters in Ge and provides a comparison with a well-understood ductile material, copper (Cu). Preliminary results show that both classic machine learning (ML) methods and artificial neural network (ANN) models offer improved predictive capability when compared with analytical prediction of surface roughness for both materials. Significantly, ML and ANN models were able to perform well for both Ge, a brittle material prone to surface fracture, and the more ductile Cu. ANN models offered the best prediction tool overall with minimal error. From a computational perspective, both ML and ANN models were able to achieve good results with smaller datasets than typical for many ML applications—which is beneficial since diamond turning can be costly. 
Recent research has suggested that the application of machine learning (ML) methods may offer both improved predictive performance in the diamond turning of brittle materials such as germanium (Ge) and key insights into the underlying physics of the turning process, especially at the transition between the ductile and brittle regimes [1-3]. Ge is a staple material for IR optics. However, unlike other materials such as chalcogenide glass, Ge cannot be molded because it is a crystalline material with a high melting temperature (938°C). It must therefore be manufactured using subtractive machining operations such as single-point diamond turning (SPDT) and diamond milling [4, 5]. Despite manufacturing limitations, Ge is often still the preferred material for optics applications due to its high refractive index (~4) in the infrared and its low optical dispersion. Ge bends light very efficiently and functions very well in optical lenses and windows. Nearly 50% of Ge used today is in the optics industry, for both IR lenses and fiber optics. Military interest in high-quality thermal imagers is strong, and the commercial market is also growing [6]. Ge is diamond turnable. However, due to its brittle nature, the machining parameters, such as rake angle, must be controlled to maintain surface integrity. Negatively raked turning tools have successfully demonstrated the suppression of brittle fracture in SPDT by increasing the hydrostatic pressure locally where the material is being removed [7]. Single-crystal Ge is anisotropic and therefore exhibits varying material behavior when diamond turned on-axis.
48th SME North American Manufacturing Research Conference, NAMRC 48
This crystal orientation effect was first demonstrated by Nakasuji et al., whose paper details how the propagation of damage increases along the cleavage planes of a given crystal orientation [8]. Since the 1990s, there has been a large amount of research on diamond machining of Ge. So-called ductile-to-brittle transitions have been discussed in the literature as a critical transition point where the generated surface is crack-free [9, 10], which is an oversimplification. In reality, subsurface damage begins to occur well before surface damage is evident. Also, an acceptable surface roughness may be obtained even when unacceptable subsurface damage or residual stress exists. Techniques such as confocal Raman spectroscopy, Rutherford backscattering spectroscopy, and transmission electron microscopy (TEM) have indicated that even with no visible surface damage, there can be significant lattice disorder from machining brittle materials [11, 12]. Owen et al. demonstrated the increased complexity involved in milling (interrupted cutting) of Ge [13]. Techniques such as SPDT on an ultra-precision machine tool are used to manufacture precision, rotationally symmetric optics with optical-quality surface roughness, waviness, and form. Over the last decade, ultra-precision machine tools have also enabled the manufacture of more complex freeform optics [14]. Major benefits of freeform surfaces include a reduction in the number of components in a product and the ability to combine multiple reflective, transmissive, or diffractive functions in a single optical element [15]. Freeform optics lack an axis of rotational symmetry. These more complex geometries require multi-axis (>2) machine tools, such that greater control is needed in the synchronized translational and rotational motions.
Also, the cutting parameters (such as rake angle in coordinated-axis turning) are continuously changing, making it more difficult to maintain surface integrity throughout a machining operation. For optical applications, maintaining key surface characteristics is an essential element of functionality. Optical components used in telescopes, spectrometers, laser energy concentrators, and thermal imagers all require strict tolerances on surface roughness, waviness, and form deviations. These tolerances are mostly dictated by the intended operating wavelength of light and have a strong influence on optical performance. In fact, the manufacturing process can have a lasting effect on the optical performance. An example case, provided by Aryan et al., showed that the optical performance of a milled surface and a turned surface can be completely different even when the two surfaces have the same RMS height errors. Initially, their study utilized the Strehl Ratio (SR) to measure performance. However, due to the complexity of the problem, the SR relied on oversimplified assumptions of isotropic surfaces, making the performance metric of limited utility [16]. Aryan et al. later used a novel method combining SR and the modulation transfer function (MTF) to provide a more robust estimate of optical performance and to better capture the effect of anisotropic features [17]. As noted above, prediction of surface characteristics and surface integrity in Ge is a complex multivariate problem. The physics behind the behavior is understood only at a heuristic level. A first objective of this paper, in the absence of more complete physical modeling, is to evaluate the predictive capability of ML methods broadly against current analytical models, for both Ge and a more well-understood metal, copper (Cu). A second objective is to evaluate the relative performance of different ML models for surface roughness prediction in Ge and Cu.
The different nature of these two materials, as represented by their ductile vs. brittle behavior when machined, suggested that it might be too difficult to identify a single ML algorithm that would perform well on both datasets. Third, this research explores how ML modeling yields insights that can both inform future analytical model development for diamond turning and suggest improvements in the selection of turning process parameters to reduce surface fracture. Experiments were conducted on a Moore Nanotechnology 350FG machine. Single-crystal diamond tools were used to machine single-crystal Ge and Cu workpieces. Measurements of the surface topography were collected using a coherence scanning interferometer (CSI). Since ML models typically require large datasets for training and testing, large numbers of experiments were conducted. To test ML for the prediction of a more well-understood process, 78 experiments were conducted on oxygen-free high-conductivity (OFHC) copper. For each set of machining parameters, surface roughness data was collected. In SPDT of Cu, the surface roughness is dominated by the geometric replication of the tool nose in the workpiece surface. A more extensive dataset was collected for diamond turning of Ge. In this case, 810 samples were collected at different machining parameters; for some parameter sets, roughness was dominated by geometric replication of the tool into the surface, while for others, material behavior was expected to dominate. Surfaces were face turned on the Moore Nanotechnology 350FG with a round-nose single-crystal diamond tool. The tool is defined by its nose radius and rake angle. Fig. 1a depicts a typical diamond turning operation, while Fig. 1b shows a close-up view of the tool-workpiece interaction. Table 1 summarizes the tooling utilized for both the Cu and Ge studies conducted in this paper. The Cu experiments used only tool no. 1.
The Ge experiments were conducted with tools no. 2, no. 3, and no. 4. The ML models were trained on all eight available input parameters to predict surface roughness. These input parameters are: tool nose radius; rake angle α; feed per revolution; depth of cut; cutting velocity; feedrate; chip thickness; and spindle speed Ω. The position of the tool with respect to the radius of the workpiece was combined with the cutting velocity to compute the desired Ω. The feedrate is simply calculated by multiplying the feed per revolution and Ω, resulting in a feed per minute. Chip thickness data is available for the Ge data but not the Cu data. In addition, the tool nose radius and rake angle α are invariant in the Cu dataset and were therefore removed from the ANN models to avoid adding noise. Equations 1-3 were used to calculate the spindle speed, feedrate, and chip thickness, respectively. The target parameters Sa, Sq, and Sz reported in this paper are areal International Organization for Standardization (ISO) height parameters that summarize the average surface roughness, RMS surface roughness, and maximum height of the areal surface in 3D, respectively. In a turning operation, a round-nosed tool cuts the surface, leaving circular cusp structures by direct geometric replication of the tool shape into the diamond-turned surface (see Fig. 1b). An operator could assume an expected surface roughness, Sa, based on the nose radius of the tool and the commanded feed. Here the arithmetic mean surface roughness is applied, which, for geometric replication, has a simple analytical relationship to the machining parameters (see Equation 4). Similarly, the RMS surface roughness can be calculated from Equation 5 (see Qu et al. [18]). The peak-to-valley can be approximated by Equation 6 [19]. Other analytical models, such as the least-squares method, can also be used as alternatives [20]. Geometric replication of the tool is, however, a very simplified description of the diamond turning process.
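The closed forms of Equations 1-6 are not reproduced in this copy of the paper; the sketch below implements the standard kinematic relations for spindle speed and feedrate, plus the commonly used parabolic-cusp approximations for geometric-replication roughness, as stand-ins. Function names and units are illustrative assumptions, not the paper's notation:

```python
import math

def spindle_speed_rpm(cutting_velocity_m_min, tool_position_radius_mm):
    # Omega = v / (2*pi*r): spindle speed needed to hold a target cutting
    # velocity at a given radial tool position (a stand-in for Equation 1)
    return cutting_velocity_m_min * 1000.0 / (2.0 * math.pi * tool_position_radius_mm)

def feedrate_mm_min(feed_per_rev_mm, omega_rpm):
    # feedrate = feed per revolution x spindle speed (Equation 2 in the text)
    return feed_per_rev_mm * omega_rpm

def geometric_roughness_um(feed_um, nose_radius_um):
    # Cusp heights left by pure geometric replication of a round-nose tool,
    # using the parabolic-cusp approximations (stand-ins for Equations 4-6):
    #   Sz (peak-to-valley)   = f^2 / (8 R)
    #   Sa (arithmetic mean)  = f^2 / (18*sqrt(3) R)
    #   Sq (RMS)              = f^2 / (12*sqrt(5) R)
    f, r = feed_um, nose_radius_um
    sz = f ** 2 / (8.0 * r)
    sa = f ** 2 / (18.0 * math.sqrt(3.0) * r)
    sq = f ** 2 / (12.0 * math.sqrt(5.0) * r)
    return sa, sq, sz
```

For example, a 5 µm/rev feed with a 500 µm nose radius predicts a geometric peak-to-valley of f²/(8R) ≈ 6.3 nm, illustrating why these cusp heights sit far below the feed itself.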
Diamond machining involves complex, high-strain-rate material flow, and the physics of this flow can affect the surface structure. In addition, the tool is not infinitely stiff, and the material may rebound elastically as the tool moves through the cut. Deflections and dynamics from the machine tool or cutter interacting with the workpiece will result in lower-quality surfaces. Further, dimensional changes occur due to temperature variations. A ZYGO NexView CSI with a 50x objective (1x zoom) was used to collect surface roughness data. The parts under test were aligned using a motorized XY stage with tip and tilt controls adjustable over ±4 degrees. Post-processing of the height maps included removing piston and a plane from the surface, as well as filtering. The surfaces were filtered using a Fast Fourier Transform (FFT) high-pass filter with a cutoff period of 80 µm. An uncertainty analysis was completed using surface topography repeatability (STR). The STR analysis followed ISO 25178-604. Ten measurements were collected at each site of interest. The measurement settings used no averaging, and the system reference was subtracted. The reported uncertainty for five surfaces is given in Table 2. The SiC CVD surface standard, an ultra-polished surface with sub-nanometer surface deviations, is the reference standard for the ZYGO NexView. The other four surfaces measured for the analysis were diamond turned on the Moore Nanotechnology 350FG using a 5 mm tool nose radius with a -25 degree rake angle. These surfaces were selected to capture the effects of the cusp structure and the surface fracture expected for Ge. Since single-crystal Ge is anisotropic, measurements were collected along two approximate directions: <011> and a direction clocked 22.5 degrees from <011>. The <011> orientation is a cleavage direction that is known to leave more brittle fracture than other preferred orientations (slip directions) when diamond turned.
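The height-map post-processing described above (piston and plane removal followed by an 80 µm FFT high-pass) can be sketched as follows; the pixel size and synthetic surface in the usage are illustrative assumptions, not the paper's measurement data:

```python
import numpy as np

def remove_piston_and_plane(z):
    # Least-squares removal of piston (constant offset) and a best-fit
    # tilt plane from a 2D height map
    ny, nx = z.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    design = np.column_stack([np.ones(z.size), xx.ravel(), yy.ravel()])
    coef, *_ = np.linalg.lstsq(design, z.ravel(), rcond=None)
    return z - (design @ coef).reshape(z.shape)

def fft_highpass(z, pixel_um, cutoff_period_um=80.0):
    # Zero all spatial-frequency components whose period is longer than
    # the cutoff (including DC), keeping only the short-wavelength roughness
    fy = np.fft.fftfreq(z.shape[0], d=pixel_um)[:, None]
    fx = np.fft.fftfreq(z.shape[1], d=pixel_um)[None, :]
    keep = np.hypot(fx, fy) >= 1.0 / cutoff_period_um
    return np.real(np.fft.ifft2(np.fft.fft2(z) * keep))
```

Applied to a map containing both a long-wavelength form error and short-wavelength cusps, the filter passes only the cusp content, which is what the Sa/Sq/Sz roughness parameters are then computed from.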
The surfaces for Ge-2 and Ge-3 were measured along the direction clocked 22.5 degrees from <011> and along the cleavage direction <011>, respectively. These results show that surfaces with inhomogeneous features, such as surface fracture, can induce large uncertainties in the measurement process. Recently, data-driven models of physical processes have demonstrated high potential to predict machining parameters such as surface roughness more accurately than physics-based analytical or traditional statistical models [21-23]. Classical physics-based models require an in-depth physical understanding of the system of interest in order to develop closed-form mathematical models. Data-driven models offer a complementary approach that has been demonstrated to offer improved predictive capability, as well as insight into the validity of the assumptions underlying physics-based models. A fundamental understanding of the behavior of the physical system is essential, but an in-depth knowledge of the assumed probability distributions of the underlying process is not a prerequisite for data-driven ML models. Further, the results of these data-driven models can often shed light on incomplete specification of the physical models and identify regimes where the observed behavior departs from the accepted understanding of the physical model and the accumulated knowledge of domain experts, as we illustrate in this paper for diamond turning. Data-driven models prioritize automatic discovery of patterns in the observed data over prescribed model design using expert knowledge. Physics-based or "model-driven" approaches are motivated by hypotheses about specific physical relationships and correlations among specific parameters. As such, they are not easily able to accommodate complexity and generally rely on simplifying assumptions. In contrast, data-driven approaches like ML typically use measurement data to build (i.e.
train or learn) a model without any prior knowledge of the underlying physics. ML algorithms differ in their approach, the type of input and output data, and the type of task or problem they are intended to solve. Supervised ML infers a functional relationship from labeled training data consisting of a set of training examples, i.e. the output labels are known and used to guide the learning process. Unsupervised learning builds a model from a set of data which contains only inputs and no desired output labels. Unsupervised learning algorithms are often used to find structure in the data, such as clusters, by grouping inputs by similarity, or aim at reducing data dimensionality to make a problem more tractable. Among the most popular data-driven approaches, especially for machining applications, are artificial neural networks (ANN), support vector machines (SVM), decision trees (DT), and random forest regression (RF). Artificial neural networks are a family of ML models, inspired by biological neural networks, that are used to estimate complex relationships between inputs and outputs [24, 25]. ML models like ANNs make predictions or decisions without being explicitly programmed to perform the task. Like the human brain, ANNs are modeled as a stylized web of interconnected nodes or artificial neurons. The networks are composed of an input layer referred to as features, one or more hidden layers that process the data, and an output layer that provides one or more data points referred to as the target output, as illustrated in Fig. 2. "Deep(er)" learning refers to networks with a larger number of hidden layers, typically more than two. In feedforward networks, the flow of information takes place in the forward direction; the inputs are used to calculate intermediate functions in the hidden layers, which are in turn used to calculate the target outputs. The artificial neurons are the computational building blocks of the ANN.
An ANN transforms input data (i.e. features) by applying a nonlinear function to a weighted sum of the inputs. Each neuron receives inputs from several other neurons, multiplies them by assigned weights, adds them, and passes the sum to one or more neurons in feedforward fashion. An artificial neuron may apply an activation function to the output before passing it to the next layer. Learning is accomplished through iterative optimization of the loss function to adjust the weights so that the network converges toward the target output(s). The Adam (Adaptive Moment Estimation) optimizer implemented here, one of the more popular gradient descent optimization algorithms, calculates an individual adaptive learning rate for each parameter from estimates of the first and second moments of the gradients. Adam is typically used in the case of noisy gradients or gradients with high curvature. Support vector machines (SVM) are supervised ML models that are used largely for classification, but also for prediction [26]. Like ANNs, SVMs infer a function from labeled training data consisting of a set of training examples of paired inputs and outputs. The objective of the SVM algorithm is to find a hyperplane in an N-dimensional space, where N is the number of input features, that distinctly classifies the data points. For example, binary classification is performed by finding the hyperplane that best differentiates between two classes, i.e. maximizes the margin between the hyperplane and the support vectors (the points closest to the classification margins), as illustrated in Fig. 3. The use of kernels can transform linearly inseparable problems into linearly separable ones. Decision trees are a nonparametric supervised learning method used for both classification and prediction [27]. As the name implies, the output of a decision tree analysis is represented by a tree-like structure, as shown in Fig. 4.
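The feedforward computation described above (weighted sums passed through an activation, layer by layer) can be illustrated with a minimal numpy sketch; ReLU hidden layers with a linear output match the configuration used later in the paper, but the layer shapes here are illustrative assumptions:

```python
import numpy as np

def relu(x):
    # rectified linear unit activation: max(0, x) elementwise
    return np.maximum(0.0, x)

def forward(x, layers):
    # Feedforward pass: each hidden layer computes relu(a @ W + b);
    # the final (output) layer applies only the linear map a @ W + b.
    # `layers` is a list of (W, b) weight/bias pairs.
    *hidden, (w_out, b_out) = layers
    a = x
    for w, b in hidden:
        a = relu(a @ w + b)
    return a @ w_out + b_out
```

Training then consists of iteratively adjusting every W and b to minimize a loss on the training samples, e.g. with the Adam optimizer discussed above.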
The algorithm breaks a given dataset into smaller and smaller subsets which form the branches and leaves of the tree. While learning in ANNs is implemented through iterative optimization of the weights at the nodes of the network, learning in decision trees is implemented by inferring decision rules from input features at the decision points in the tree structure. ANNs are referred to as "black box" models because the underlying relationships between parameters or features are not revealed during the training process. In contrast, decision trees are sometimes referred to as "white box" models because the tree-like structure that is produced provides insight into what is happening inside the model at the branching points of the tree. A meta-algorithm called boosting is often used in conjunction with ML methods like decision trees to improve performance. Boosting is a method of converting a set of weak learners, i.e. methods whose predictions are only slightly better than random guessing, into stronger learners. The AdaBoost (AB) meta-algorithm, chosen for these experiments, works by putting more weight on difficult-to-classify instances and less on those already handled well. A random forest (RF) consists of a large number of individual decision trees that operate as an ensemble [28, 29]. An RF can be thought of as a committee of decision trees where, in the case of classification, the target output is derived by voting, while for prediction the target output is obtained by averaging. The underlying idea is to combine many different decision trees, built from different bootstrapped samples of the original dataset and a random (but prespecified) number of features, as shown in Fig. 5. RF methods are a way of combining multiple deep decision trees that are trained on different parts of the same training dataset, with the goal of reducing the variance. RF methods typically avoid over-fitting.
Individual decision trees that grow very deep tend to overfit, exhibiting low bias but high variance. Generally, an ensemble of a large number of uncorrelated decision tree models will outperform any of the individual models in the ensemble. As with decision trees, RF methods provide some insight into the relative importance of each feature for the predicted output. Implementation of the various ML approaches follows a typical workflow, illustrated in Fig. 6, consisting of five steps that are applied to model a problem. Data Collection (step 1) gathers a set of samples, each containing values for the input variables describing the problem being modeled and, for use with supervised learning algorithms, the corresponding numeric output target values associated with these inputs in the case of regression (as performed in this paper) or categorical labels in the case of classification. Data Preprocessing (step 2) focuses on cleansing and formatting the dataset to be acceptable to ML algorithms through feature selection, feature transformation, and feature extraction or engineering. Data cleansing consists of removing or repairing incorrect or missing data, reducing noise, and/or applying data augmentation techniques to avoid the need for additional data collection and to create a more balanced dataset, which is particularly important in classification tasks. Analysis of individual features can lead to the removal of unnecessary features, while transforming the remaining features by normalization or scaling can help improve the ML algorithm's accuracy. Extraction of new features from the input data is another technique for engineering a dataset that may be better suited to the ML method. The ultimate goal of an ML method is to create a model capable of performing well on new data it has not "seen" during the learning phase. For this reason, in step 2, the resulting clean dataset is split into non-overlapping training and testing sets.
During step 3, Model Training, adequate parameters are selected for the learning algorithm, which is then executed to find patterns in the training data that map the input features to the output target. The product of this step is an ML model that captures these patterns and can produce the correct target output when presented with a new input sample. In Model Performance Evaluation (step 4), the generated model is evaluated by measuring its responses to the testing dataset using performance metrics selected for the specific problem. Model hyperparameters may need to be modified or tuned if results are not yet as accurate as expected. In this case, the process goes back to step 3, as the model must be re-trained using the modified parameters. Lastly, in step 5, Final Model Production, the final, tuned model is implemented and deployed to the production environment to operate on live data. In this work, the Cu and Ge datasets contain features with varying magnitudes, units, and ranges, as listed in Table 1. If a dataset feature is large in scale compared to the others, then in ML algorithms based on distance measures, e.g. SVM, this large-scaled feature may dominate and skew the model results. Feature scaling, or standardization, is a type of feature transformation applied to individual features during the data preprocessing step that normalizes the data to a particular range. This helps to improve algorithmic accuracy and to speed up calculations, and can be safely applied when the scale of a feature is irrelevant or misleading; however, normalization should not be applied if the scale is meaningful. Standardization replaces feature values by their z-scores, linearly transforming the data to have zero mean and a standard deviation of 1. z-scores are calculated for each individual feature by Equation 7, z = (x − μ)/σ, where μ is the feature mean and σ is the feature standard deviation.
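The z-score standardization of Equation 7, applied independently to each feature column, can be sketched as follows; the toy feature matrix in the test is an assumption, not the paper's data:

```python
import numpy as np

def standardize(features):
    # Equation 7: z = (x - mu) / sigma, computed per feature column,
    # so every feature ends up with zero mean and unit standard deviation
    mu = features.mean(axis=0)
    sigma = features.std(axis=0)
    return (features - mu) / sigma
```

Note that, as in the text, only the input features would be transformed this way; the target roughness values are left in their original units because their magnitudes matter to the analysis.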
The input features of the Cu and Ge datasets were standardized to ensure good results for the SVM models, and the same standardized data was utilized in all the experiments for comparison purposes. The target output was not standardized, as the magnitude and range of the Sa, Sq, and Sz surface roughness measurements are important to our analysis. ML experiments were implemented in the programming language Python v.3.7.3, utilizing the Keras v.2.2.5 neural network API to access the TensorFlow library in the backend, and various regression algorithms of the Scikit-learn v.0.20.3 ML library. Details of the experiments are provided next. The task of predicting surface roughness in Cu and Ge was tackled by performing experiments with four classic ML algorithms—Decision Tree, Random Forest, AdaBoost, and Support Vector regressors—and a set of ANN architectures differing in the number of hidden layers and neurons per layer (Fig. 7). ML algorithms have a number of hyperparameters that require fine-tuning to reduce the generalization error and achieve the best possible model. For example, "max_depth" is the hyperparameter that limits how deep a decision tree is allowed to grow before it is cut off. We experimented with four decision trees by setting their max depth to 2, 5, 8, and 11. This hyperparameter should also be tuned when applying the Random Forest and AdaBoost algorithms, as they are ensembles of decision trees. We performed two separate experiments with each of these algorithms by setting max depth to 8 and 11, since these ensemble methods are less susceptible to over-fitting. Another tuning hyperparameter is the maximum number of estimators ("n_estimators") before boosting is terminated in AdaBoost, and likewise the maximum number of trees in the forest, also called "n_estimators", in Random Forest. We used 100 estimators for both algorithms.
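The tree-based configurations above can be sketched in scikit-learn as follows. The synthetic dataset is an illustrative assumption standing in for the machining data, and only the hyperparameters stated in the text (max depths 2/5/8/11, 100 estimators) are taken from the paper:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor, AdaBoostRegressor

# Synthetic stand-in for the machining dataset (illustrative only)
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 5))
y = X[:, 0] ** 2 + 0.5 * X[:, 1] + 0.05 * rng.normal(size=200)

models = {}
# Four decision trees at the max depths tried in the text
for depth in (2, 5, 8, 11):
    models[f"DT-{depth}"] = DecisionTreeRegressor(max_depth=depth)
# Random forests with 100 trees at the two ensemble depths from the text
for depth in (8, 11):
    models[f"RF-{depth}"] = RandomForestRegressor(n_estimators=100, max_depth=depth)
# AdaBoost with 100 estimators (the paper also tunes the depth of the base
# tree; the keyword for that differs across scikit-learn versions, so this
# sketch keeps the library's default base tree)
models["AB"] = AdaBoostRegressor(n_estimators=100)

for name, model in models.items():
    model.fit(X, y)
```

Each fitted regressor exposes `predict`, so the same evaluation loop can score every configuration on a held-out test split.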
A Support Vector Regressor (SVR) uses a set of mathematical functions, its "kernel", to take lower-dimensional data as input and transform it into a higher-dimensional form, which determines the hyperplane that enables prediction of continuous values, i.e. the target output. We experimented with three general-purpose SVR kernels: Gaussian Radial Basis Function (RBF), Sigmoid (Sig), and Polynomial (Poly). A strong advantage of RBF is that it requires no prior knowledge of the data. The Sigmoid kernel is customarily used as a proxy for neural networks, while the Poly kernel is popular in image processing and was added as a baseline for comparison with the other kernels. The RBF kernel has two free parameters: the regularization parameter C, set to 100, and the epsilon value, set to 0.1, which exempts points within epsilon of the actual value from any penalty during the model training phase. For the Poly kernel, C was set to 1000 and epsilon was kept at 0.1. Additional free parameters for the Poly kernel are coef0, the independent term in the kernel function, which was set to 1, and the degree of the polynomial kernel function, set to 3. For the Sigmoid kernel, coef0 was set to -3 while epsilon was kept at 0.1. Selecting the ANN algorithm and configuration that best fits a dataset is an open research question that continues to be actively investigated. Among a growing number of ANN algorithms and optimizers, we selected the Adam optimizer for our experiments given that it is straightforward to implement, computationally efficient, appropriate for problems with noisy data, and requires little tuning of hyperparameters. We tested a number of feedforward ANN configurations, i.e. ANNs with different numbers of hidden layers and different numbers of neurons per layer. We report results for six ANN configurations. A basic ANN consisting of a single hidden layer with a small number of neurons is used as a "Baseline".
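The three SVR configurations can be written directly in scikit-learn with the kernel parameters stated above; the standardized synthetic inputs are an assumption standing in for the paper's (standardized) machining features:

```python
import numpy as np
from sklearn.svm import SVR

# SVR configurations with the hyperparameters reported in the text
svr_models = {
    # RBF: C = 100, epsilon = 0.1
    "RBF": SVR(kernel="rbf", C=100, epsilon=0.1),
    # Polynomial: C = 1000, epsilon = 0.1, coef0 = 1, degree = 3
    "Poly": SVR(kernel="poly", C=1000, epsilon=0.1, coef0=1, degree=3),
    # Sigmoid: coef0 = -3, epsilon = 0.1
    "Sig": SVR(kernel="sigmoid", coef0=-3, epsilon=0.1),
}

# Illustrative standardized inputs (SVMs are distance-based, which is why
# the features were standardized in the preprocessing step)
rng = np.random.default_rng(1)
X = rng.normal(size=(150, 4))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=150)

for name, model in svr_models.items():
    model.fit(X, y)
```

Because epsilon defines a penalty-free tube around the regression surface, residuals smaller than 0.1 contribute nothing to the training loss, which makes the fit robust to small measurement noise.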
ANN configurations named "Wider" have a single hidden layer containing a larger number of neurons, while the "Deeper" ANN configurations have three hidden layers with different numbers of neurons. The activation function applied to the hidden layers was the rectified linear unit (ReLU), while a linear activation function was applied to the output layer. The loss function minimized when optimizing the network weights was the mean squared error. The Cu dataset contains only 78 samples, a small dataset by ML standards. The data was randomly split, with 80% of the samples (62 samples) used for training the model and the remaining 20% (16 samples) for testing. The larger Ge dataset has 810 samples and was also randomly split into 80% (648 samples) for training and 20% (162 samples) for testing, to allow a better comparison of accuracy results against those for the Cu data. There are multiple ways to define the performance criteria for a ML model, and a number of metrics are typically used. For regression tasks, predictive capability is typically evaluated with metrics that measure how close predicted values are to the actual (true) values. To evaluate the predictive capability of ML for surface roughness in diamond-turned Ge and Cu, we compute the explained variance score, root mean squared error, mean absolute error, maximum residual error, and R² score, as described below. In Equations 8-12, ŷᵢ is the predicted value and yᵢ the true value for the i-th example, ȳ is the mean of the true values, and σ² is the variance, i.e. the square of the standard deviation. The explained variance regression score (EVS), defined in Equation 8, is a statistical measure of the proportion of the variance in a given dataset that is accounted for by a regression model. The mean absolute error (MAE) measures the average absolute difference between each true value and its prediction.
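The paper's ANNs were built in Keras; as a dependency-light sketch of the "Deeper" [50-25-12-1] configuration (three ReLU hidden layers of 50, 25, and 12 neurons, a linear output, adam optimization, squared-error loss), scikit-learn's MLPRegressor is used here instead, together with the 80/20 random split described above. The data is synthetic stand-in data.

```python
# Approximation of the "Deeper" [50-25-12-1] ANN with MLPRegressor:
# hidden layers (50, 25, 12), ReLU activation, adam solver; for regression
# MLPRegressor uses an identity (linear) output and squared-error loss.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
X = rng.normal(size=(810, 8))                    # Ge-sized synthetic stand-in
y = np.abs(X[:, 0]) + 0.05 * rng.normal(size=810)

# 80/20 random split, as described for both the Cu and Ge datasets.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

ann = MLPRegressor(hidden_layer_sizes=(50, 25, 12), activation="relu",
                   solver="adam", max_iter=300, random_state=0)
ann.fit(X_tr, y_tr)
```

This reproduces the architecture and optimizer choice, though not Keras's exact training schedule (batch size of 1, fixed 300 epochs).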
MAE is an absolute measure of fit and is computed by Equation 9. The root mean squared error (RMSE) is the square root of the average squared difference between each prediction and its true value (Equation 10). Squaring the differences between predictions and true values makes RMSE give higher weight to large errors, a useful property for data analysis. Like MAE, RMSE is in the same units as the target. The maximum residual error (Max Error) measures the worst-case error between a predicted value and the true expected value, a useful property when very large errors can result in catastrophic or expensive events. R-squared (R²), or the coefficient of determination, is the statistical measure of the proportion of the variance in the dependent (target output) variable that is predictable from the independent (input feature) variables. Typically, R² varies between 0 and 1: R² = 0 indicates that the model is not capable of predicting the target from the given input features, while R² = 1 means the target output can be predicted without error from the input features. In rare cases, when the selected model fits worse than a horizontal line, R² can be negative; a negative value thus indicates that the mean of the data is a better predictor than the selected model. Performance metrics were computed for both the training and test phases of model building. A ML model "learns" a system's behavior from the input-output pairs of samples it "sees" and processes during the training phase. Model accuracy is evaluated during the testing phase by measuring the error of its predictions on input samples it has not seen before, for which the corresponding target output values are also unknown to the model. Model evaluation results using these metrics are presented in the next section. For each prediction target, i.e.
Sa, Sq, and Sz, the classic ML algorithms were tested by building 1000 models per algorithm and per target, i.e. executing 1000 runs of each algorithm for each target, and averaging the results obtained for each evaluation metric. Each model was trained on data obtained by randomly splitting the (cleansed and standardized) dataset into 80% training and 20% testing sets. In each run, the same training inputs and target outputs were used to train each of the classic ML algorithms. Experiments were also performed for the six ANN configurations described above; in the ANN experiments, the number of epochs was set to 300 with a batch size of 1. Tables 3, 4 and 5 present the average results of 1000 runs of the classic ML algorithms for Sa, Sq, and Sz, respectively. Numeric results for both training and testing are provided for each of four metrics. An EVS score of 1 indicates that the model fully captures the variance in the data; thus, the best models by this criterion are those closest to 1. For metrics measuring the error between predictions and true values, the best-case scenario is a score as close to 0 as possible. ML model performance is measured by how well the model can predict new data, i.e. the test set. Our criterion for the best model for each surface roughness target is the lowest RMSE score among the test results. A closer inspection of the numeric results in Tables 3-5 shows that the model with the lowest test RMSE also scores as the best, or close to the best, on the other three criteria. Random Forest with a maximum depth of 8 outperforms all other models when predicting Sa and Sz for the Cu dataset, while AdaBoost with a maximum depth of 11 was the best model for predicting Sq. Table 6 presents the results for the ANN configurations tested.
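The repeated-split protocol above, combined with the metrics defined earlier, can be sketched as follows. Each run redraws an 80/20 split, trains the model, and the test-set metrics are averaged over the runs (1000 in the paper; 10 here to keep the sketch fast). The data is synthetic stand-in data.

```python
# Repeated random-split evaluation with the five metrics described above.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import (explained_variance_score, max_error,
                             mean_absolute_error, mean_squared_error, r2_score)
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 8))
y = np.abs(X[:, 0]) + 0.05 * rng.normal(size=200)

scores = {"EVS": [], "MAE": [], "RMSE": [], "MaxErr": [], "R2": []}
for run in range(10):                             # 1000 runs in the paper
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                              random_state=run)
    model = RandomForestRegressor(max_depth=8, n_estimators=100,
                                  random_state=run).fit(X_tr, y_tr)
    pred = model.predict(X_te)
    scores["EVS"].append(explained_variance_score(y_te, pred))
    scores["MAE"].append(mean_absolute_error(y_te, pred))
    scores["RMSE"].append(np.sqrt(mean_squared_error(y_te, pred)))  # RMSE = sqrt(MSE)
    scores["MaxErr"].append(max_error(y_te, pred))
    scores["R2"].append(r2_score(y_te, pred))

avg = {k: float(np.mean(v)) for k, v in scores.items()}
```

Averaging over many random splits, as done here, reduces the variance that a single lucky or unlucky split would introduce into the model comparison.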
As with the classic ML model results, the lowest RMSE score on the Cu test set is the criterion for selecting the best model.
Table 3: Sa average prediction results of 1000 runs for the copper dataset.
Even though all models are good predictors of the target Sa surface roughness, as indicated by their R² scores, the "Deeper" ANN models achieve a better fit to the data. For surface roughness Sa, the "Deeper" ANN with the larger number of neurons in its three hidden layers, i.e. 50, 25, and 12 neurons, outperforms all the other ANN configurations, as well as the best model selected from the classic ML algorithms, i.e. Random Forest with max depth 8. Therefore, among our experiments with the Cu data, the best model for predicting Sa is the "Deeper" ANN with configuration [50-25-12-1]. All ML and ANN models also outperform the analytical model with respect to RMSE. Fig. 9 shows that the loss, based on MSE for the "Deeper" ANN, gradually decreases to a steady state within 50 epochs. The two plots in Fig. 10 allow a direct comparison between the absolute error of Sa values predicted by the best "Deeper" ANN and the absolute error of the analytical Sa values (Eq. 4) as a function of the feed fr, and show that the ANN model has a lower absolute error.
Figure 11: Q-Q plot between observed Sa and predicted Sa by the best "Deeper" ANN model for the Cu dataset.
The Q-Q (quantile-quantile) plot in Fig. 11 confirms the close similarity between the observed and predicted distributions. A graphic visualization of how closely the "Deeper" ANN model predicts, compared with the analytical model (Eq. 4), is presented in Fig. 12; the graph insert provides a close-up view of the prediction error at very low values of fr. Tables 7-9 contain the average results of 1000 runs of the classic ML algorithm experiments for predicting Sa, Sq, and Sz, respectively, for germanium.
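A Q-Q comparison like that of Fig. 11 can be sketched by plotting matched quantiles of the observed against the predicted values; points lying near the line y = x indicate closely matching distributions. The values below are illustrative stand-ins, not the paper's measurements.

```python
# Quantile-quantile comparison of observed vs. predicted distributions.
import numpy as np

rng = np.random.default_rng(4)
observed = np.abs(rng.normal(size=162))           # stand-in observed Sa values
predicted = observed + 0.02 * rng.normal(size=162)  # close, slightly noisy fit

qs = np.linspace(0, 100, 21)                      # 21 matched percentiles
q_obs = np.percentile(observed, qs)
q_pred = np.percentile(predicted, qs)

# Deviation of the Q-Q points from the identity line y = x; small values
# indicate the two distributions agree closely.
max_dev = float(np.max(np.abs(q_obs - q_pred)))
```

Plotting q_pred against q_obs (e.g. with matplotlib) and overlaying the identity line reproduces the visual check the paper uses.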
The evaluation metrics are the same as those for the Cu models, and our selection of the best model for Ge is again based on the lowest RMSE score. A strong indication of the power of the ML approach is that the same ML algorithms selected as the best models for the three surface roughness targets for Cu repeat their performance on the Ge dataset: Random Forest with max depth 8 has the lowest RMSE for Sa and Sz, and AdaBoost with max depth 11 is the best model for Sq on the Ge data. While the RMSE scores for Sa and Sq for both Cu and Ge are in acceptable ranges and similarly scaled, there is a large discrepancy between the RMSE scores for Sz of the two materials, with the scores for Ge being much higher than those for Cu and outside the acceptable range. The root cause of this large discrepancy is currently unknown. More research and measurement fieldwork are required to investigate whether the values collected for the Ge dataset are skewed by measurement or other errors. Alternatively, the large deviation in performance for Ge could be attributed to the inhomogeneous surface fracture of Ge during SPDT; such surface fracture is characterized by spatial randomness that the models cannot observe, since no image data is currently involved in the training. Results for all ANN model configurations tested with the Ge data are shown in Table 10. As with the results obtained for Cu, the best model among all ANNs tested is the "Deeper" ANN with configuration [50-25-12-1]. Still more encouraging is the fact that this ANN configuration has a much lower RMSE test score than the best classic ML model found for the Ge data, Random Forest with max depth 8. Thus, the "Deeper" ANN with configuration [50-25-12-1] is the overall best model for predicting Sa for both Ge and Cu. Fig. 13 shows that the loss, based on MSE for the "Deeper" ANN, gradually decreases to a steady state within 100 epochs. The Q-Q plot of Fig.
14 shows the correlation between the observed and ANN-predicted values, while suggesting a few possible outliers that require further investigation. Fig. 15 presents strong evidence of the superior performance of the "Deeper" ANN model in predicting Sa for the Ge dataset over the geometric analytical model, as the absolute errors of its predictions are significantly lower than the absolute errors of the analytical Sa given by Eq. 4. The superior performance of the "Deeper" ANN model in predicting the target parameter Sa (non-solid markers) over the analytical model (continuous curve), when compared with the observed values (solid markers), is shown in Figure 16. The analytical model significantly underperforms at higher values of the feed fr. Results are shown for three values of the tool nose radius, R = 0.5, 1.0 and 5.0 mm. The analytical model also significantly underperforms the "Deeper" ANN model at smaller values of fr, indicating that in this range the surface roughness is not dominated by simple geometric replication of the tool into the surface, which is all the analytical model is able to capture. Figs. 17-19 provide a comparison of the observed, predicted, and analytical Sa values for the three tool geometries (R equal to 0.5, 1.0 and 5.0 mm) over a range of feeds. While the "Deeper" ANN model used to predict the output included all eight of the aforementioned input parameters, Figs. 17-19 display predictive errors as a function of fr only. These figures provide graphical insight into the quality of the surface finish based on tool geometry and feed, and also indicate which observed values are associated with surface fracture over the experimental range of fr. As can be seen, surface fracture was observed at much lower feeds when the tool nose radius is small. Experiments with tool nose radius R = 0.5 mm experienced surface fracture during turning at feeds as low as 1.5 µm/rev, while no fracture was observed at the larger tool nose radius R = 5.0 mm (with rake angle α = -25 deg, cutting speed of 2 m/s, and Ω = 25).
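Eq. 4 is not reproduced in this excerpt; as a hedged stand-in to illustrate why larger nose radii yield smoother geometric finishes at a given feed, the sketch below uses the classical round-nose-tool roughness estimates, peak-to-valley Rt = f²/(8R) and Ra ≈ 0.0321 f²/R, with feed f and nose radius R in consistent units. These are textbook geometric-replication formulas, not necessarily the paper's Eq. 4, and the unit conventions below are assumptions.

```python
# Classical geometric (tool-replication) roughness estimates for a
# round-nose turning tool. These capture only tool geometry and feed;
# real surfaces deviate from this floor, especially once brittle
# fracture appears.
import numpy as np

def geometric_rt(feed_um, nose_radius_mm):
    """Theoretical peak-to-valley roughness Rt = f^2/(8R), in micrometers."""
    r_um = nose_radius_mm * 1000.0          # mm -> um
    return feed_um ** 2 / (8.0 * r_um)

def geometric_ra(feed_um, nose_radius_mm):
    """Approximate arithmetic-mean roughness Ra ~ 0.0321 f^2/R, in micrometers."""
    r_um = nose_radius_mm * 1000.0
    return 0.0321 * feed_um ** 2 / r_um

feeds = np.array([1.5, 5.0, 10.0])          # feed in um/rev
# Nose radii from the text: a larger R gives a lower geometric roughness
# at the same feed, consistent with fracture appearing sooner at small R.
ra_by_radius = {R: geometric_ra(feeds, R) for R in (0.5, 1.0, 5.0)}
```

The quadratic dependence on feed explains why predictive error versus fr is the natural axis for Figs. 17-19.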
In addition, the onset of surface fracture is associated with increased error between the observed Sa and the analytical model. Data-driven ML models have become more useful predictive tools as modern computing power has grown, and they are increasingly part of the data science toolkit for understanding the theory and practice of machining. This paper demonstrates the predictive capability of ANN methods and four classic ML methods (Random Forest, Decision Trees, AdaBoost and Support Vector Machines) for predicting surface roughness during diamond turning of both Ge and Cu. Analytical and numerical models such as finite element analysis (FEA) provide excellent tools for modeling the plastic flow of ductile materials like Cu, for example during chip formation. However, these models have demonstrated shortcomings when modeling brittle materials like Ge that experience fracture or other random material defects during turning. The results presented in this paper suggest that both ML and ANN methods are capable of addressing these shortcomings. First, over the range of all eight input parameters studied, both ML and ANN models offer significant improvements in the prediction of surface roughness for Ge when compared against analytical models. These ML and ANN models also offer a slight improvement over analytical models in predicting surface roughness for Cu; however, the most significant improvements are attained for Ge, which is well known to experience brittle fracture during diamond turning. Second, and surprisingly given the differences in material structure and dynamics between Ge and Cu, the best-performing predictive models for the three ISO standard surface finish parameters Sa, Sq and Sz were identical for both materials. Among the ANNs, the "Deeper" [50-25-12-1] ANN model exhibited the best overall fit with respect to RMSE and R² for both Ge and Cu when predicting the surface roughness parameter Sa.
Similarly, among the classic ML models, Random Forest with max depth 8, AdaBoost with max depth 11, and Random Forest with max depth 8 achieved the best accuracies when predicting surface roughness parameters Sa, Sq, and Sz, respectively, for both Ge and Cu. Finally, this research has yielded several key insights that can inform future analytical model development for diamond turning and suggest improvements in the selection of turning process parameters, especially for Ge. Future work will incorporate dynamometer force measurements and surface classification data into ML prediction models for Sa, Sq, and Sz.
This work was funded by the North Carolina Consortium for Self-Aware Machining and Metrology (CSAM) and the US National Science Foundation (NSF Grant CMMI-1437225). The authors gratefully acknowledge the financial support of the NSF.