key: cord-0077943-6koabsko authors: Chaves, Carlos; Gosavi, Abhijit title: On general multi-server queues with non-poisson arrivals and medium traffic: a new approximation and a COVID-19 ventilator case study date: 2022-05-10 journal: Oper Res Int J DOI: 10.1007/s12351-022-00712-2 sha: 813ec3b4de0a36152d878cea45918bf8b702dae2 doc_id: 77943 cord_uid: 6koabsko We consider the multi-server, single-channel queue, i.e., a G/G/k queue with k identical servers in parallel, under the first-come-first-served discipline in which the inter-arrival process is non-Poisson, the service time has any given distribution, and traffic is of medium intensity. Such queues are common in factories, airports, and hospitals, where the inter-arrival times and service times are typically not exponentially distributed, but rather have double-tapering distributions whose probability density functions taper on both sides, e.g., gamma, triangular etc. For these conditions, a new closed-form approximation based on only the mean and variance of the two inputs, the inter-arrival and service times, is presented. Determining distributions of inputs typically requires additional human effort in terms of histogram-fitting and running a goodness-of-fit test, which is avoided here. The new approximation is tested on a variety of scenarios and its performance is benchmarked against simulation. Further, the new approximation is also implemented on a ventilator case study from the recent COVID-19 pandemic to demonstrate its utility in optimizing server capacity. The approximation provides errors typically in the range 1–15% and 31% in the worst case. In systems where data change rapidly and decisions must be made quickly, this approximation will be particularly useful. The queue that forms in front of a parallel set of griding machines for jobs that arrive from heat treatment in a manufacturing plant For the so-called M/G/k queue, which is a well-studied multi-server queue with k servers, the arrival process is Poisson and the service time can have any given distribution. Existing approximations from the literature have been known to work well for M/G/k queues (Whitt 1993) , regardless of the traffic intensity. However, the M/G/k model is not applicable when the exponential distribution does not hold for the inter-arrival time. The following evidence points to real-world systems where the M/G/k model does not work. In manufacturing systems, the inter-arrival time for a job is often gamma distributed (Benjaafar et al. 2004) , while the service time (production time) can have the gamma distribution in case the machine is failure-prone that leads to high variability (Das and Sarkar 1999) . The material in the "Appendix" of the book of Baker and Trietsch (2013) clearly states that the production time on machines is not likely to have the exponential distribution. See also Burgin (1975) , Muralidhar et al. (1992) , and Savsar and Choueiki (2000) , which provide extensive evidence of the inter-arrival times and service (production) times carrying the gamma distribution in manufacturing systems, rather than the exponential distribution. For airports, empirical evidence suggests that inter-arrival times commonly have the gamma distribution (Khadgi 2009; Suryani et al. 2010) . For the ID checking queue at a TSA security line or at other service counters in an airport, one typically encounters a human server, whose service time is often modeled via the triangular distribution that approximates the beta distribution (Johnson 1997) . Finally, Williford et al. (2020) make the case for using the gamma distribution rather than the exponential for the length of stay in a hospital during a serious illness. Thus, clearly, there is a need for studying multi-server queues where the inter-arrival time is not exponentially distributed and the service time is either gamma or triangular (i.e., Assumption A6). Need for approximations When distributions of inter-arrival and service times are available, under restrictive assumptions on the nature of the system, tedious analytical procedures leading to closed-form approaches, involving Laplace Fig. 3 The queue in a hospital that needs ICU beds is typically virtual within the computer system, but arriving patients assemble in the waiting areas while they are triaged by a nurse transforms (Langaris 1986; Eckberg 1977) , embedded Markov chains (Nadarajah 2008) , or phase-type distributions (Altiok 2012), can also be used for performance evaluation. When distributions are available, an alternative route is to use discrete-event simulation software (Law 2014), although the latter requires expensive software and becomes sluggish as k increases beyond 10. After simulation became a popular approach in the 1980s, research interest in closed-form approaches waned. When the analyst is able to derive the distributions of the inputs and has access to a simulation software, or is able to use exact analytical procedures, the approximation suggested in this paper will not be necessary. However, in the real world, means and variances of inter-arrival and service times (inputs) can be estimated with less effort; it is in those circumstances that an approximation based on the mean and variance of the inputs, such as the one proposed here, has great practical value. Furthermore, simulation software are expensive, while a closed-form formula within a spreadsheet software is cheaper and easier to use. Several specific scenarios in which such approximations are useful are described below. First, to meet the needs of automated decision-making within the so-called Cyber-Physical System (CPS) in the era of Industry 4.0 (Tao et al. 2018) , methods based on two moments are likely to be more attractive, as fitting distributions requires additional computational effort in terms of histogram fitting, analyzing different distributions, and finally employing a goodness-of-fit test, e.g., the Chi-square test and the Kolmogorov-Smirnov test, to select the best fit. In a CPS, decision-making and controls for hardware are exercised automatically through software written within so-called digital twins. In such systems, the requirement of using queueing models remains critical (Sinha and Roy 2019) , and therefore models rooted in means and variances will remain attractive because they can be encoded into in-built functions within the hardware of digital twins for rapid computations and control. Second, in traditional, computerized MRP (Materials Requirements Planning) systems, the proposed approximation based on means and variances of inputs will be useful in estimating lead times approximately. Since production data change frequently, determining distributions is typically ruled out and queueing estimates based on means and variances are popular (Askin and Goldberg 2002). Further, rough estimates of lead times are needed for determining the number of kanbans (Monden 1983), as well as for designing machine capacity (Heragu 2018); sub-optimally designed machine capacity leads to long lead times and increased operational costs (De Treville et al. 2004) . Third, large airports that witness major fluctuations in their demand-arrival patterns during the day use queueing models to determine server capacities-integrating queueing-performance formulas into numerical optimization techniques (Hafizogullari et al. 2003; Manataki and Zografos 2009) to minimize queue waiting times. Finally, hospitals serving critical patients in need of ventilators, e.g., during a pandemic where conditions can alter dramatically every hour, contain G/G/k queues. The ongoing COVID-19 pandemic is making it critical to determine the optimal number of ICU beds equipped with ventilators to save lives; for optimization, these systems need to be modeled as multi-server queuing systems (Raffensperger et al. 2020 ). In summary, simple closed-form approximations based on only the mean and variance of inputs (inter-arrival and service times), which can be executed in spreadsheet software or digital twins, continue to hold a special appeal in performance evaluation. Furthermore, even when exact techniques are available, simple approximations with a "back-of-the-envelope" nature (Whitt 1993) and the ability to perform "rough-cut optimization" (Papadopoulos and Heavey 1996) are attractive in practical, real-world settings. Such approximations are generally not very accurate; however, they can be used for rough-cut capacity optimization of machines. As such, even if the approximations are not very accurate, they rapidly provide usable estimates of lead times that help in quick decision-making. Contributions of this Paper The new approximation in this paper deviates from the literature as follows. It is based on an aggregation procedure of a G/G/1 queue and not on the M/M/k formula, unlike the trend in much of the literature (Lee and Longton 1957; Kimura 1986; Shore 1988; Page 1982; Sakasegawa 1977) ; see "Appendix" for definitions of various multi-server queues, including M/M/k. It is shown through extensive numerical testing that the new approximation exhibits more accurate behavior than that of existing G/G/k approximations from the literature (Marchal 1985; Kraemer and Langenbach-Belz 1976) . The new approximation is also benchmarked against simulation, as the latter has been commonly used in the literature for that purpose (Altiok 2012; Rabta 2013). The aggregation procedure within our proposed approximation first condenses a multi-server queue into a fictitious single-server queue, via a correction factor for the squared coefficient of variation of the service time, and then retrieves the original single-server queue via another correction factor. Furthermore, the new approximation is developed for Assumptions A5 and A6, which are commonly true of conditions found in many real-world systems, but not studied as widely in the literature. Finally, the new approximation is based on only the mean and variance of two key queueing inputs, the inter-arrival time and the service time, which makes it suitable for automatic computations in manufacturing systems and in airports and hospitals where conditions can change rapidly. The rest of this article is organized as follows. Theoretical background material and notations for the research in this paper are provided in Sect. 2. Section 3 details the methodology underlying the new approximation. Section 4 presents numerical results obtained from using the new methodology and the benchmarking exercises. Finally, Sect. 5 presents the conclusions drawn from this research, as well as directions for future research. The section begins with mathematical notation and then discusses two benchmarking models. Two approximations, described in the following two subsections, have been selected for benchmarking of the new approximation. The reason for selecting them is: they also rely on only the mean and variance of the inter-arrival and service times, making them comparable. Further, both of these approximations are rooted in the socalled M/M/k model, which has been used widely in the literature to develop approximations for multi-server queues. Marchal (1976) developed an approximation for G/G/1 queues that was combined with the exact M/M/k formula to develop an approximation for G/G/k queues (Marchal 1985) . His G/G/1 approximation, which holds under Assumptions A1: A4, is shown below: The existing exact formula for an M/M/k queue (Ross 2014 Note that P 0 above denotes the probability that there are zero customers in the system. Based on his G/G/1 approximation (given in Eq. (2)), Marchal (1985) developed a correction factor, denoted by CF, that when applied to the M/M/k formula works as an approximation for the G/G/k queue. The correction factor is given by: Combining Eqs. (3) and (5), one has the following approximation (Marchal 1985) for the G/G/k queue: where P 0 is as defined in Eq. (4). The above approximation will be referred to as the MAR (short for Marchal) approximation. Kraemer and Langenbach-Belz (1976) developed the following approximation for the G/G/1 queue, which holds under Assumptions A1: A4: where For benchmarking his own approximation, Marchal (1985) suggested an alternative correction factor from the single-server approximation in Kraemer and Langenbach-Belz (1976) , which was: in which g is as defined in Eqs. (8)-(9). This leads to the following approximation for the G/G/k queue: when C 2 a ≤ 1; where P 0 is as defined in Eq. (4). The above G/G/k approximation will be referred to as the K-L-B (short for Kraemer and Langenbach-Belz) approximation in this paper. The underlying principle in the new multi-server aggregation procedure, referred to as M-SAP for short, is to aggregate (or transform) a single-channel, multi-server queue into a hypothetical single-server queue with the same utilization, develop an estimate for the squared coefficient of variation in the hypothetical single-server queue, and then employ this estimate within an existing approximation for G/G/1 queues to evaluate the original multi-server queue's key performance metrics, i.e., the expected waiting time and the expected number in the queue. This last step is performed via a correction factor that retrieves the original multi-server queue. Steps in M-SAP are outlined as follows in order to first provide an overview of this procedure: Step 1: (Aggregation) The G/G/k will be aggregated into a hypothetical G/G/1 queue, i.e., the mean and variance of the service time of this hypothetical G/G/1 queue will be computed via an aggregation procedure, whose details are provided below in Sect. 3.1. Step 2: (Single-Server Approximation) The expected queue length of the hypothetical G/G/1 queue, denoted by L q , will be computed using either the MAR or the K-L-B approximation and the aggregation procedure of Step 1; the details are provided below in Sect. 3.2. Step 3: (Retrieval) The expected queue length ( L q ) of the original G/G/k queue will be obtained via details shown in Sect. 3.3. In what follows, the three steps in the procedure are described in detail. The objective here is for the service process in the aggregated single-server queue to behave in a manner similar to that in the original multi-server queue. To this end, a new squared coefficient of variation for the service time of the aggregated queue, Ĉ 2 s , is proposed. The intuition underlying the aggregation is that approximations in queueing networks are often handled via modifications of the squared coefficients of variation of either the service time or inter-arrival time (Buzacott and Shanthikumar 1993), which should ideally lead to balanced behavior in the final result. By "balanced behavior," one refers to behavior in the middle of the spectrum of values obtained of the mean queue length, rather than at the extremes. For instance with the two variables, the squared coefficients of variation of inter-arrival and service times, both variables at the same level (high or low) would represent the middle of the spectrum. On the other hand, when the two variables are at conflicting levels, one would obtain behavior at the two ends of the spectrum: one variable at a high level and the other at a low level would represent one extreme of the spectrum, while one at a low level and the other at a high level would represent the other extreme of the spectrum. Hence, the balance here is rooted in the notion that when the variability in one of the inputs (interarrival and service times) is high (low), that in the other inputs should also be high (low) to obtain reliable estimates. Although the intuition suggests this kind of behavior, the exact thresholds for what is considered "high" and "low" and the precise expression for modifying the squared coefficient of variation are determined empirically in this paper, i.e., via computational experiments to determine which threshold and which modification leads to the best results vis-á-vis the results from simulations. This entails trial-and-error based experimentation with different combinations of high and low values and benchmarking against simulation to determine which combination delivers the best performance. Performing computational experiments to identify a suitable replacement for an existing term is common in queueing approximations, although this is a tedious process. For instance, see Sakasegawa (1977) 3 ) in which D denotes deterministic (constant); the reason for this replacement is justified there on grounds of empirically satisfactory results. Also, finding thresholds for determining fields of satisfactory behavior is also common in queueing literature. For instance, the classical heavy-traffic threshold, > 0.8 , above which heavy-traffic approximations rooted in the normal distribution are known to Fig. 4 Schematic showing the aggregation scheme in M-SAP that aggregates the k servers of a G/G/k queue into one server to generate an aggregated single-server queue with modified mean and variance of the service time, as well as the retrieval to obtain the original G/G/k queue work in a satisfactory manner, has been determined via computational experiments (Whitt 1993) . From our extensive experimentation, the following thresholds and approximate formulas are proposed herein: -When the variability in the inter-arrival time is low, i.e., C 2 a < 0.3 : the effect of the variability in the service time should be lower to maintain balance and hence the variance is reduced by the number of servers, k: -When the variability in the inter-arrival time is high, i.e., when C 2 a >= 0.3 : the effect of the variability in the service time should be magnified, again to maintain balance, and hence the variance is multiplied by the number of servers, k: The two formulas above will be combined as for convenience of representation: Since we consider one server to replace a multi-server queueing system, it is necessary to divide the arrival process into k equal parts, and therefore the arrival rate to the aggregated queue will be ∕k ; otherwise, one will have an unstable system. The service rate of the aggregated single server will be . Taken together, this implies that in the aggregated queue: The above is a necessary condition for consistency with the value of utilization in any multi-server queue (Whitt 1993) which should be less than one for stability. Step 2 will employ a G/G/1 approximation for the aggregated single-server queue. Within the approximation, the squared coefficient of service will be used as defined above by Eq. (12) and as defined above by Eq. (13); the value of C 2 a will not be altered. A regime, defined herein as a well-defined area on the graph of which the x-axis is C 2 a and the y-axis is C 2 s , is constructed for estimating the value of L q , i.e., the estimated mean length of the aggregated queue. The regime is described via four sub-areas or scenarios that have been identified on the graph. See Fig. 5 for the geometric structure of this regime. As stated above, extensive computational experimentation involving trial and error was conducted that led us to conclude that if the K-L-B rule is used within M-SAP for the hypothetical single-server queue, Eq. (9) works more accurately than Eq. (8) for calculating g. The approximation formula needed in each scenario within the regime is presented below. Scenario 1: Conditions: C 2 a < 0.30;C 2 s ≤ 0.15 : Use the MAR G/G/1 approximation given in Eq. (2) to compute L q . Scenario 2: Conditions: C 2 a > 0.3;C 2 s ≤ 0.15 : Use MAR G/G/1 approximation provided in Eq. (2) to compute L q . Scenario 3: Conditions: C 2 a < 0.3;0.15 < C 2 s ≤ 1 : Use the K-L-B G/G/1 approximation found in Eq. (7) using the value of g computed via Eq. (9) to compute L q . Scenario 4: Conditions: C 2 a > 0.3;0.15 < C 2 s ≤ 1 : Use the MAR G/G/1 approximation given via Eq. (2) to compute L q . In this retrieval step, the value of the mean queue length of the original queue is obtained via the following equation that seeks to compress the elongated hypothetical queue by k: The mean wait in the queue can now be computed via Little's law, i.e., Eq. (1). The intuition underlying the proposed approximation for the mean queue length, i.e., Eq. (14), is as follows: Since k servers are aggregated, the resulting variability in the aggregated (fictitious) server is artificially higher, which must be adjusted for in the final calculation. This adjustment is performed by dividing the queue length of the single-server, aggregated queue obtained from the previous two steps by k. The numerical testing with M-SAP as well as that for the benchmarking techniques was performed under the condition: C 2 a < 1 . This condition is standard for most manufacturing, airport, and hospital systems; also when the variance is so high that C 2 a exceeds 1, higher-order moments are often needed (Buzacott and Shanthikumar 1993; Shore 1988; Marchal 1985) , which is beyond the scope of this work. Computer programs for implementing M-SAP were run on a personal computer in a university setting that used an Intel Pentium Processor with a speed of 2.66 GHz on a 64-bit operating system. The simulation programs used the software ARENA. The M-SAP, MAR, and K-L-B approximations were implemented within the software MATLAB because it provided great flexibility in programming; however, this task can just as easily be carried out in spreadsheet software. The MATLAB program required no more than 5 seconds for any given scenario, while the simulations with ARENA used 10 replications each and needed about 55 seconds per scenario. In addition, every scenario required benchmarking via the two G/G/k models based on MAR and K-L-B; this also needed no more than 5 seconds per scenario. In all, for the four scenarios, a total of 91 cases were tested. The approximation was tested under the following conditions: (i) gamma distribution for inter-arrival times, (ii) = 1∕5 , and (iii) = ∕(k ) was approximately 0.67 (medium traffic); the value for was varied as follows. For k = 2 , = 0.15 ; for k = 3 , = 0.1 ; for k = 4 , = 0.075 ; for k = 5 , = 0.06 ; for k = 6, = 0.05; for k = 7, = 0.043 ; and for k = 8 , = 0.0375 . The different parameters for the interarrival times are shown in Table 1 . Other double-tapering distributions were not chosen for the inter-arrival time as no evidence was found for them in the literature as suitable choices for the inter-arrival time. Figure 6 shows a screenshot of the computer program written in ARENA. Key details of this program are as follows: The main computer program is comprised of three modules, CREATE, PROCESS, and DISPOSE. Customers (entities) enter the system through the CREATE module, where the parameters of the inter-arrival time distribution are specified using the following ARENA format: GAMM (scale, shape), where GAMM denotes the gamma distribution. Within the PROCESS module, the parameters of the service time are specified from one of the following three choices for the model studied here: GAMM (scale, shape) for the gamma distribution and TRIA(minimum, mode, maximum) for the triangular distribution. The DISPOSE module allows entities to leave the system. The capacity of the server is specified in the bottom window and it equals k. The time for which the computer program is run and the number of replications is set within the execution panel (not shown in the figure) . Results from the computational work are provided in Tables 2, 3 , 4, 5, 6, 7, 8, 9. In these tables, SERT is used to denote service times, and the following acronyms are used for three distributions: Ga (scale, shape) for the gamma distribution and T (minimum, mode, maximum) for the triangular distribution. The error against simulation for the approximation was defined as follows, following the literature (Rabta 2013) : where Approx represents M-SAP, MAR, or K-L-B and Sim denotes simulation. M-SAP delivers good performance with the error in the range of 1-15% in most cases; occasionally the error exceeds 30%, but this is rare compared to MAR and K-L-B. In fact, MAR and K-L-B deliver large errors frequently with their errors, exceeding even 150% in many cases. What is important to note is that the performance of M-SAP is consistently reliable, whereas it is difficult to predict where MAR and/or K-L-B perform well. It should also be reiterated here that errors are unavoidable with these approximations, as they do not use distributions of the inter-arrival and service times (Whitt 1993; Sakasegawa 1977) . However, this approximation delivers reasonable results in settings where distribution fitting is ruled out, as discussed in Sect. 1. There are cases where MAR or K-L-B perform well, but no pattern can be found for that except for the following condition: 0.7 < C 2 a ≤ 1 . Under this specific condition, K-L-B and MAR perform extremely well because as C 2 a approaches 1, the interarrival time distribution starts approximating the exponential distribution; approximations rooted in the M/M/k formula used by MAR and K-L-B are then naturally appropriate, leading to good performance. While this specific condition is not common in the systems studied in this paper, computational results are provided within the "Appendix" to demonstrate the good performance of approximations from the literature under this condition. Finally, optimization was performed to illustrate how the M-SAP model is useful for optimizing server capacity. The goal here is to determine the minimum server capacity at which the mean waiting time is lower than a pre-set upper threshold. Mathematically, this implies: The optimal value of k obtained from the optimization exercise is denoted by k * , while the minimum server capacity needed to obtain a stable system is denoted by k . Note that k can be obtained for any queue by finding the minimum integer at which k < 1 . It should also be mentioned that at a server capacity of k , the mean wait times are expected to be very long, although finite. Two cases from the COVID-19 pandemic were used for optimization using available data. The first case is representative of an urban area where the arrival rate is Minimize k such that W q < T where T is a pre-set threshold. likely to be higher, while the second one is representative of a rural area where the arrival rate is likely to be lower. Urban Area Hospital from NHS Data Data from the National Health Service (NHS), UK, from the peak of the pandemic in 2020 were gathered from the website (Data 2020), where NHS has made data available. The raw data are provided in the "Appendix" for the reader's convenience. This data led to the following estimates for the length of stay: 13.1972 days with a variance of 4.5456 days-squared. This implies = 1∕(13.1972) and C 2 s = 0.1186 . The variance in the inter-arrival time is not provided at the NHS website, but will clearly vary from place to place and hence was estimated from other sources. It must be noted, however, that the model used (M-SAP) is general and should be applicable for any given dataset, provided one has access to the mean and variance of the inter-arrival and service times. The interarrival time in an urban area was assumed to be 1 per week, i.e., = 1∕7 per day with a gamma distribution whose C 2 a = 0.15 , from existing data (Raffensperger et al. 2020 ). The optimization was performed via performance evaluation at each value of k using T = 0.025 day or 36 min. Since the M-SAP approach carries out performance evaluation in a very short time period on a computer (requiring no more than 5 seconds), no optimization algorithm was used, but rather the performance was evaluated at all feasible values of k. For this case, k = 93 and k * = 118 . Figure 7 plots the mean waiting time versus the number of ventilators (k) for these data. When k =k = 93 , i.e., k satisfies the stability condition, the mean wait is 1.296 days, which exceeds the threshold, T. Rural Area Hospital from United States The inter-arrival time in an urban area from the United States was assumed to be 1 per week, i.e., = 1∕7 per day with a gamma distribution whose C 2 a = 0.15 . The service time was assumed to have a gamma distribution with a mean of 9.1 days, i.e., = 1∕9.1 per day, and C 2 s = 1∕2 ; both the inter-arrival time and service times in this case were based on data from Raffensperger et al. (2020) . T was set to 0.025 day as in the urban area case. The stability value of ventilator capacity, k , here is 2 while the optimal value, k * equals 5. The resulting plot of the mean wait against ventilator capacity is shown in Fig. 8 . At the stability condition, i.e., k =k = 2 , the mean waiting time is 1.1165 days, which exceeds the reasonable threshold of 36 min, as in the urban case. A motivating factor for this research was the need to develop closed-form multiserver queueing approximations under the following conditions: (a) traffic intensity is medium, (b) the inter-arrival time is not exponentially distributed but carries a double-tapering distribution, and (c) the service time also has a double-tapering distribution. In particular, in many real-world settings, e.g., airports, hospitals, and manufacturing systems, all three conditions apply, which rule out the usage of the fairly accurate, existing M/G/k models or the heavy traffic approximation for G/G/k queues. The non-Poisson arrivals and non-Poisson service rates make these systems difficult to approximate in closed form (Gupta et al. 2010 ). In the context of a hospital, it must be noted, inaccuracies often lead to under-designed systems with lengthened waits, and waiting beyond acceptable thresholds can cause the patient's death. In airports and factories also, poorly designed systems can cause long, harmful delays. While discrete-event simulation does provide a reliable mechanism to solve problems of this nature, it requires (a) expensive software and (b) distribution fitting. Further, simulations of G/G/k systems can become unacceptably sluggish for large values of k. Therefore, in the real-world, closed-form approximations based on only the mean and variance that are usable within spreadsheet software continue to remain of practical importance. term) behavior. The condition 0 < < 0.5 is defined as low traffic, 0.5 < ≤ 0.8 as medium traffic, and 0.8 < < 1 as high traffic. -Queuing Discipline: This is the order in which customers are served. First come first served implies that within a pool of customers waiting in line, the customer who enters first is served first. Other queuing disciplines include last in first out and shortest time first etc. -Correction Factor: This is a scalar quantity often used in queueing approximations to multiply the performance metric of a class of queues (e.g., M/M/k) to obtain the corresponding metric for another class of queues (M/G/k). The approximation proposed in this paper can be presented in the following user-friendly format for programming as follows: If C 2 a >= 0.7 , use MAR or K-L-B. Otherwise: -First compute Ĉ 2 s via Eq. (12) and via Eq. (13). -Then, compute the mean length in the G/G/k queue as (Tables 10, 11, 12): 3 5 1.00 T (4, 15, 31) 0.11 1.9310 6.39 6.35 of the manuscript and for providing numerous suggestions that significantly improved the paper. In particular, the authors thank the first reviewer for the added discussion on using the discrete-event simulation program, the second reviewer for suggesting an appropriate acronym, M-SAP, for the new model, and the third reviewer for numerous comments and suggestions, including one that led to providing a justification for the model. Ga (3.0303, 10) Ga (3.0303, 10) Ga (3.0303, 10) Ga (3.0303, 10) Ga (3.0303, 10) Ga (3.0303, 10) Optimum alternatives of tandem g/g/k queues with disaster customers and retrial phenomenon: interactive voice response systems On the effect of product variety in production-inventory systems Breaking the dimensionality curse in multi-server queues The gamma distribution and inventory control New Jersey Chydzinski A (2020) Queues with the dropping function and non-Poisson arrivals Optimal preventive maintenance in a production inventory system From supply chain to demand chain: The role of lead time reduction in improving demand chain performance Sharp bounds on Laplace-Stieltjes transforms, with applications to various queueing problems On the inapproximability of M∕G∕K : Why two moments of job size distribution are not enough Simulation's role in baggage screening at the airports: a case study Queueing in high-performance packet switching Singh P (2020) Supplementary variable technique (SVT) for non-Markovian single server queue with service interruption (QSI) The triangular distribution as a proxy for the beta distribution in risk analysis Simulation analysis of passenger check-in and baggage screening area at Chicago Rockford International Airport Supervised-learning-based approximation method for multi-server queueing networks under different service disciplines with correlated interarrival and service times A two-moment approximation for the mean waiting time in the GI∕G∕s queue Approximations for multi-server queues: system interpolations Approximations for the delay probability in the M∕G∕s queue Approximate formulae for the delay in the queueing system GI∕G∕1 The waiting-time process of a queueing system with gamma-type input and blocking Queueing process associated with airlines passenger check-in A generic system dynamic based tool for airport terminal performance analysis The optimizing of the passenger throughput at an airport security checkpoint An approximation formula for waiting times in single-server queues Numerical performance of approximate queuing formulae with application to flexible manufacturing systems Amsterdam Monden Yasuhiro (1983) Toyota production system. An Integrated Apprpach to Just-In-Time Muralidhar K, Swensethj SR, Wilson RL (1992) Describing processing time when simulating JIT environments Probabilities for queueing systems with embedded Markov chains Data for length of stay in UK made available by NHS Tables of waiting times for M∕M∕n , M∕D∕n and D∕M∕n and their use to give approximate waiting times in more general queues Queueing theory in manufacturing systems analysis and design: a classification of models for production and transfer lines A hybrid method for performance analysis of G/G/m queueing networks Planning hospital needs for ventilators and respiratory therapists in the COVID-19 Crisis Estimating the implied value of the customer's waiting time An overview of queuing delay and various delay based algorithms in networks. Computing An approximation formula l q = ∕(1 − ) A neural network procedure for kanban allocation in JIT production control systems Simple approximations for the GI∕G∕c queue-I: the steady-state probabilities Scheduling status update for optimizing age of information in the context of industrial cyber-physical system Air passenger demand forecasting and passenger terminal capacity expansion: a system dynamics framework Analysis and autonomic elasticity control for multi-server queues under traffic surges Data-driven smart manufacturing Approximations for the GI∕G∕m queue Dealing with highly skewed hospital length of stay distributions: the use of Gamma mixture models to study delivery hospitalizations Optimal control of arrivals in a G/G/c/K queue with general startup times via simulation A novel packet switching framework with binary information in demand side management The authors want to thank the three anonymous reviewers for their careful reading The novelty of this work lies in developing a new scheme to aggregate a single-channel, multi-server queue into a fictitious single-channel, single-server queue with the same utilization. The scheme allows one to exploit existing, accurate G/G/1 approximations to develop formulas for mean waiting times, rather than the M/M/k formula used extensively in the literature. A conclusion from this study is that for medium-traffic, multi-server queues in manufacturing and service systems, where the inter-arrival time density function and the service time density function are double tapering, M-SAP performs well consistently in comparison to the existing MAR and K-L-B approaches from the literature. The research also leads to new insights about the performance of MAR and K-L-B in systems where the performance gradually improves as the inter-arrival time's distribution starts approaching the exponential distribution. Future research in this topic should be directed toward using higher-order moments to address the condition k ≥ 10 and developing approximations for variance of the waiting times in G/G/k queues. Queueing Notation Glossary: We provide a glossary of terms commonly used in queueing theory for the convenience of the reader:-Generally Distributed Random Variable: This is a random variable that can have any given distribution. -C 2 : Squared coefficient of variation of a random variable: This the variance of a random variable divided by the square of its mean. -Double-Tapering Distribution: This is a continuous random variable who probability density function tapers on both sides to zero. -M/M/k Queue: This is a queue with infinite waiting capacity in which there are k servers in parallel, and both inter-arrival and service times have the exponential distribution. M denotes Markovian, which means exponential distribution here, and the first and second letters in this notation denote the distributions of the inter-arrival and service times, respectively. The mean length of this queue can be computed using the well-known, exact formula given via Eqs. (3) and (4). -M/G/k Queue: This a queue with infinite waiting capacity in which there are k servers, the inter-arrival time has the exponential distribution, and the service time has any given distribution. The latter is also often described as the service time being generally distributed. -G/G/k Queue: This a queue with infinite waiting capacity in which there are k servers, and both the inter-arrival and service times have any given distribution. -Poisson Arrival and Service Processes: A Poisson arrival process to a system implies that the inter-arrival time is exponentially distributed. Similarly, a Poisson service process implies that the service time is exponentially distributed. -: This is the traffic intensity of the queue (which equals the proportion of time the servers are busy). This is a positive number and it has to be less than 1 for a queue to be stable. In general, only stable queues can be analyzed for steady-state (long-