key: cord-0074564-yynex8ni
authors: Moghaddas, Zohreh; Amirteimoori, Alireza; Kazemi Matin, Reza
title: Selective proportionality and integer-valued data in DEA: an application to performance evaluation of high schools
date: 2022-02-07
journal: Oper Res Int J
DOI: 10.1007/s12351-022-00692-3
sha: 2db589d0d47d08414daaee65c9bc6c42e8e44a30
doc_id: 74564
cord_uid: yynex8ni

Conventional data envelopment analysis (DEA) models are usually built on a constant or variable returns to scale assumption, depending on the technology under investigation. It is also assumed that all inputs and outputs are real-valued data. However, in many practical applications, the proportionality or convexity axioms need to be modified. This study extends the hybrid returns to scale DEA models to the case of integer-valued input and output data. We refine the previous axioms to introduce a new minimal extrapolation technology set. Moreover, we formulate two mixed-integer linear programming models for efficiency evaluation and target setting. An empirical application to 30 high schools in Iran is provided to validate the proposed approach. The data analysis, including efficiency evaluation and the identification of benchmark units, is also performed.

Data envelopment analysis (DEA) is a non-parametric mathematical programming technique originally introduced by Charnes et al. (1978) and extended by Banker et al. (1984). This data-oriented technique is used to assess the performance of a set of homogeneous decision making units (DMUs) that consume multiple inputs to produce multiple outputs. Conventional DEA models are often built on basic information about returns to scale, such as constant or variable returns to scale for the underlying production set (Cooper et al. 2006). In real applications of performance evaluation of production systems, however, input and output data are sometimes integer-valued.
In this sense, to conduct an in-depth performance analysis, it is important to modify the traditional DEA models to take discrete input/output data into account. Neglecting the integrality of inputs and/or outputs may lead to misestimated results in the efficiency evaluation of production units. At first sight, rounding a real-valued target to the nearest integer point seems a rational way to obtain an optimal target. However, this method may lead to inefficient or even infeasible target points and to misleading efficiency evaluations. Handling integer data in a modified DEA model was first studied by Lozano and Villa (2006) and then extended and improved by Kuosmanen and Kazemi Matin (2009) , Kazemi Matin and , Kuosmanen et al. (2015). As Podinovski (2004) discussed, in many real DEA applications full proportionality between all inputs and outputs cannot be assumed either. This means that a subset of inputs may not be proportional to a subset of outputs in the evaluation. These issues show that the existing approaches need to be modified so that they can be applied to cases where selective proportionality and discrete data coexist.

This work extends the hybrid returns to scale (HRS) models in DEA to the case of integer-valued input and output data by taking an axiomatic approach. An associated minimal extrapolation technology set is introduced, and mixed integer linear programming (MILP) models are suggested for efficiency evaluation in the resulting technology. To demonstrate the applicability and capability of the new approach, two examples are provided.

This paper is organized as follows: A literature review on integer DEA and HRS models is given in Sect. 2. Section 3 briefly presents the concept of selective proportionality in DEA frameworks. The HRS technology is extended in Sect.
4 and a new Farrell-type efficiency measure is presented for the treatment of integer-valued inputs and outputs. Section 5 discusses a descriptive application to the performance evaluation of 30 high schools in Iran. The paper concludes in Sect. 6.

Different assumptions about DEA axioms and various types of data have been widely studied in the DEA literature in order to better analyze the performance of production units. The first two basic DEA models, i.e., the CCR (Charnes et al. 1978) and BCC (Banker et al. 1984) models, were developed based on constant returns to scale (CRS) and variable returns to scale (VRS) technologies, respectively. Since then, a wide range of DEA models have been introduced to deal with the performance evaluation problem under different assumptions. These models have their own advantages and limitations. Podinovski (2004) highlighted that the VRS technology is conservative and may overestimate the true efficiency scores in many cases. On the other hand, the full proportionality assumption underlying the CRS technology does not hold in all cases. Consequently, the author introduced a novel hybrid returns to scale (HRS) technology to bridge the constant and variable returns to scale technologies. This type of hybrid scenario is also used in other DEA studies; for example, Wu and O'Brien (2010) measure efficiency with mixed orientation, suggest a mixed radial DEA model, and verify that the results obtained lie within the solutions of the input and output orientations. In the HRS models, the technology set is built on a selective proportionality axiom, obtained by modifying the full proportionality axiom of the CRS technology. The HRS production technology is a convex polyhedral set in which a subset of inputs and outputs operates under the CRS assumption while the other inputs and outputs operate under the VRS assumption.
The most important feature of HRS technology is that it simultaneously covers both quantity and quality of data in a unified production set which often appears in technologies with different types of returns to scale (RTS) assumptions. Several extensions and applications of HRS technologies have been reported in the recent DEA literature. Podinovski (2009) developed a production possibility set that exhibits both full and selective proportionality in a unified set. Cook and Zhu (2011) introduced a new multiple variable proportionality (MVP) concept as an extension of HRS technology. The MVP shows how output bundles are separated into distinct subsets with different types of RTS. Kazemi Matin and Emrouznejad (2011) utilized a variant of the HRS assumption and developed a bounded technology set by taking into account integer-valued data. Alirezaee and Boloori (2012) proposed a proportional model of trade-offs in the DEA framework by modifying the proportionality axiom and by considering some input-output replacement. Huang et al. (2012) considered a hybrid DEA model and evaluated proportionate and non-proportionate inputs with radial and non-radial measures in an application to Taiwanese international tourist hotels. The authors utilized the HRS to evaluate the impact of marketing expenses on subjects. Podinovski et al. (2014) reported another application of HRS for performance assessment of secondary schools. Recently, Afsharian et al. (2015) proposed a modification of free disposal hull (FDH) technology by incorporating the selective proportionality axiom. The HRS model incorporating production trade-offs was developed by Podinovski et al. (2017) with an application to the efficiency assessment of public universities in Malaysia. More recently, Ferreira et al. (2018) and Ferreira and Marques (2020) proposed an extension of the traditional order-α method to estimate an empirical convex α-level. 
They proposed a step forward on the order-α robust nonparametric method for technical efficiency assessment. As Ferreira and Marques (2020) stated, the order-α robust nonparametric method is computationally complex and expensive. In other words, in a DEA framework with at most 500 DMUs, we prefer to solve a MILP model instead of using the order-α method. To the best of our knowledge, no research has directly considered integer-valued data in HRS models. This paper is the first attempt to take both the integrality of the data and the selective proportionality of inputs and outputs into consideration.

It is initially assumed that there is a set of n decision making units (DMUs), where DMU$_j$ ($j \in \{1, \dots, n\}$) consumes m inputs $x_j = (x_{ij}),\ i \in \{1, \dots, m\}$ to produce s outputs $y_j = (y_{rj}),\ r \in \{1, \dots, s\}$. Basic DEA models are built upon a technology set that satisfies a set of underlying axioms on disposability and returns to scale. A common example is the set of basic axioms for the CCR production set $T_c$, including:

Ȧ5. Closedness: $T_c$ is a closed set.
Ȧ6. Minimum extrapolation: $T_c$ is the intersection of all sets that satisfy axioms Ȧ1–Ȧ5.

Regarding the axioms Ȧ1–Ȧ6, the associated production possibility set (PPS) can be stated as follows:

$$T_c = \Big\{(x, y) : x \ge \sum_{j=1}^{n} \lambda_j x_j,\ y \le \sum_{j=1}^{n} \lambda_j y_j,\ \lambda_j \ge 0,\ j = 1, \dots, n \Big\}$$

To derive the VRS technology set, axiom Ȧ3 of the CRS technology is dropped, and the associated minimal extrapolation technology set was constructed by Banker et al. (1984) as follows:

$$T_v = \Big\{(x, y) : x \ge \sum_{j=1}^{n} \lambda_j x_j,\ y \le \sum_{j=1}^{n} \lambda_j y_j,\ \sum_{j=1}^{n} \lambda_j = 1,\ \lambda_j \ge 0,\ j = 1, \dots, n \Big\}$$

The CRS technology assumes full proportionality among all inputs and outputs, while the VRS technology assumes no proportionality. In most practical situations, however, only some of the inputs and outputs can be assumed to be proportional to each other; the remaining variables take no part in the proportion. Podinovski (2004) addressed such circumstances and introduced a hybrid model that combines the CRS and VRS assumptions.
In this model, the CRS assumption is applied to selected sets of inputs and outputs, while the VRS assumption holds for the remaining data. Podinovski (2004) claimed that the discriminating power of an HRS model is better than that of either the VRS or the CRS formulation in most practical situations. In what follows, we partition the sets of inputs and outputs into two groups: those inputs and outputs that are proportional to each other and those that are not. We use the superscripts "P" and "NP" to denote proportional and non-proportional input-output variables, respectively. To this end, consider the following partitions according to selective proportionality: $I = I^P \cup I^{NP}$ and $O = O^P \cup O^{NP}$. It is noted that $O^P$ is assumed to be proportional to $I^P$ while $I^{NP}$ and $O^{NP}$ are not. The subsets "P" and "NP" of inputs and outputs are non-empty, mutually exclusive and collectively exhaustive. Thus, each production plan can be stated as $(x, y) = (x^P, x^{NP}, y^P, y^{NP})$, where $x^P \in \mathbb{R}^{|I^P|}$, $y^P \in \mathbb{R}^{|O^P|}$, $x^{NP} \in \mathbb{R}^{m-|I^P|}$, and $y^{NP} \in \mathbb{R}^{s-|O^P|}$. Podinovski (2004) defined the HRS technology based on the following axioms:

A1. Observations inclusion: For all j: $(x_j, y_j) \in T_{HRS}$.
A2. Convexity: $T_{HRS}$ is a convex set.
A3. Free disposability: If $(x, y) \in T_{HRS}$, $\tilde{x} \ge x$ and $0_s \le \tilde{y} \le y$, then $(\tilde{x}, \tilde{y}) \in T_{HRS}$.
A4. Selective proportionality: Let $(x, y) \in T_{HRS}$; then $(\tilde{x}, \tilde{y})$ is defined in the expansion and contraction scenarios as follows:
Expansion scenario: $(\tilde{x}^P, \tilde{x}^{NP}, \tilde{y}^P, \tilde{y}^{NP}) \in T_{HRS},\ \forall\, \alpha > 1$ (3)
Contraction scenario: $(\tilde{x}^P, \tilde{x}^{NP}, \tilde{y}^P, \tilde{y}^{NP}) \in T_{HRS},\ \forall\, 0 \le \beta < 1$ (4)
Podinovski (2004) proved that under the convexity assumption the contraction scenario can be stated as follows:
A5. Closedness: $T_{HRS}$ is a closed set.
A6. Minimum extrapolation: $T_{HRS}$ is the minimal set that satisfies axioms A1–A5.

Regarding the axioms A1–A6, it is shown that the associated production possibility set (PPS) can be stated as follows: In the presentation of the production set $T_{HRS}$ in (5), the factors bounded above by 1 act as contraction factors and those bounded below by 1 act as expansion factors.
Different assumptions on the DEA axioms have been considered for constructing a variety of technologies that better formulate and analyze practical production systems. Podinovski (2009) considered full and selective proportionality simultaneously and applied the proposed models to a practical example. As a result, he noted that there is a notable difference between the discrimination powers of the CRS and HRS models. Based on the above discussion, and in order to produce more reliable results in the performance assessment of production units, the common DEA models need to be modified to address applications where selective proportionality holds in the presence of integer-valued input and output data.

We employ an axiomatic approach aimed at addressing integer-valued data in the HRS models. For simplicity of presentation and without loss of generality, it is assumed that all input and output data can only take integer values. This assumption will be relaxed later. Consider the following axiom defined in the integer environment by Kuosmanen and Kazemi Matin (2009): Obviously, the integrality assumption contradicts standard DEA assumptions such as free disposability and convexity. This indicates that the traditional axioms need to be modified to meet the new conditions. To do this, they introduced the following axioms for the contraction and expansion scenarios:

Natural divisibility: If $(x, y) \in T$ and $\exists\, \beta \in [0, 1)$ such that $(\beta x, \beta y) \in \mathbb{Z}^{m+s}$, then $(\beta x, \beta y) \in T$.
Natural augmentability: If $(x, y) \in T$ and $\exists\, \alpha > 1$ such that $(\alpha x, \alpha y) \in \mathbb{Z}^{m+s}$, then $(\alpha x, \alpha y) \in T$.

These axioms are integer-restricted counterparts of the non-increasing and non-decreasing RTS axioms of conventional DEA models. Based on our earlier discussion of selective proportionality, we propose the following set of axioms for taking into account both integrality and proportionality relations in a unified technology set, $T$. B4. Natural selective proportionality: a.
If $(x, y) \in T$ and $\exists\, \alpha > 1$ such that $(\tilde{x}, \tilde{y})$, defined by the expansion scenario in (3), satisfies $(\tilde{x}, \tilde{y}) \in \mathbb{Z}^{m+s}_{+}$, then $(\tilde{x}, \tilde{y}) \in T$. b. If $(x, y) \in T$ and $\exists\, 0 \le \beta < 1$ such that $(\tilde{x}, \tilde{y})$, defined by the contraction scenario in (4), satisfies $(\tilde{x}, \tilde{y}) \in \mathbb{Z}^{m+s}_{+}$, then $(\tilde{x}, \tilde{y}) \in T$. $T$ is the minimal set that satisfies axioms B1–B5.

The third axiom, B3, states that if a certain quantity of outputs can be produced with a given quantity of inputs (all with integer values), then fewer integer-valued outputs can also be produced with more integer-valued inputs. The notion of natural disposability (B3) can be interpreted as the integer-restricted counterpart of free disposability. Standard convexity is likewise restricted to integer-valued data and written in (B2) as natural convexity. Both parts of (B4) can be interpreted as discrete variants of the RTS axioms, since they limit the continuous re-scaling of input and output data. The following theorem proves that, under our adapted set of axioms, the reference technology $T^{IDEA}_{HRS} = T_{HRS} \cap \mathbb{Z}^{m+s}_{+}$ is the smallest set among all possible technologies that satisfy the mentioned axioms.

Theorem 1 $T^{IDEA}_{HRS}$ is the intersection of all sets that satisfy the axioms of feasibility (B1), natural convexity (B2), natural disposability (B3), and natural selective proportionality (B4).

Proof See Appendix 1.

By Theorem 1, $T^{IDEA}_{HRS}$ is the minimum extrapolation technology set under the refined set of axioms B1–B4. Now consider the following PPS of the HRS technology, which consists of all integer-valued input and output data: To avoid non-linear terms in (6) and following Podinovski (2004), we use the following variable substitutions: let $\lambda_j + \mu_j - \nu_j = \lambda_j \alpha_j \beta_j$ and $\lambda_j - \nu_j = \lambda_j \beta_j$, $j = 1, \dots, n$. Accordingly, we have: This means that the technology set $T^{IDEA}_{HRS}$ can be equivalently stated as: $T^{IDEA}_{HRS}$ is linear in terms of the unknown variables $\lambda_j$, $\mu_j$ and $\nu_j$. The right-hand sides of the constraints of $T^{IDEA}_{HRS}$ are free of scaling variables.
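The natural divisibility and augmentability axioms quoted earlier from Kuosmanen and Kazemi Matin (2009) can be made concrete with a small sketch. It relies on one elementary fact: for an integer vector whose components have greatest common divisor g, a scalar multiple is again an integer vector exactly when the scalar is k/g for some integer k. The function names and the sample plan (6, 9, 12) are hypothetical, and the sketch covers only the full (non-selective) scaling of these two axioms, not the selective version in B4.

```python
from functools import reduce
from math import gcd


def natural_contractions(plan):
    """All integer points beta * plan with 0 <= beta < 1 (natural divisibility)."""
    g = reduce(gcd, plan)
    if g == 0:  # all-zero plan: every multiple is the zero vector itself
        return [tuple(plan)]
    # beta = k / g for k = 0, ..., g - 1; g divides every component exactly
    return [tuple(k * v // g for v in plan) for k in range(g)]


def natural_expansions(plan, limit):
    """All integer points alpha * plan with 1 < alpha <= limit (natural augmentability).

    `limit` is a hypothetical integer cap, used only to keep the list finite.
    """
    g = reduce(gcd, plan)
    if g == 0:
        return []
    # alpha = k / g for k = g + 1, ..., limit * g
    return [tuple(k * v // g for v in plan) for k in range(g + 1, limit * g + 1)]


plan = (6, 9, 12)  # hypothetical integer production plan (inputs and outputs stacked)
print(natural_contractions(plan))   # scaled-down integer plans
print(natural_expansions(plan, 2))  # scaled-up integer plans, at most doubled
```

Because the admissible scalars are restricted to multiples of 1/g, a plan whose components are coprime has no integer contraction other than the zero vector, which is why these axioms are weaker than their continuous counterparts.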
Having access to the structure of $T^{IDEA}_{HRS}$, we can now proceed to evaluate the performance of any production unit in the new technology set. The analysis step is performed by computing efficiency measures for the observed units relative to the production set constructed from the refined axioms. To do this, we adapt the input radial measure of Farrell (1957) to the new discrete production set $T^{IDEA}_{HRS}$ and suggest the following radial measure for evaluating DMU$_k$: Combining this radial measure with the linear version of the HRS technology introduced by Podinovski (2004), we provide the following input-oriented MILP model for measuring the efficiency score of DMU$_k$: The proposed model seeks the maximal radial input reduction that reaches an integer-valued point in the corresponding PPS that dominates DMU$_k$. This approach differs from the first model proposed by Lozano and Villa (2006). To deal with Pareto-efficiency, a second phase optimization model is needed to obtain non-dominated target points in the corresponding PPS. A modified second phase model is suggested in Jie et al. (2015). In the presence of proportional inputs/outputs and integer-valued inputs, this model can be formulated as follows: Model (10) is solved to obtain benchmark units. Given the optimal solution of this model, the benchmark units can be obtained from the following formula:

In this section, we apply the proposed approach for illustration and comparison purposes. First, we present a simple example to compare the results of the traditional and the new models in the efficiency evaluation of the observed units. Then we apply the approach to the performance assessment of 30 high schools in Iran.

To illustrate the applicability of the proposed approach, we take data for 10 hypothetical DMUs with a single input $i_1$ and two outputs $o_1$ and $o_2$, where all the data are assumed to be integer. It is also assumed that $i_1$ and $o_1$ are proportional.
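As a baseline for these ten units, an input-oriented FDH screening that ignores both selective proportionality and the integrality refinement can be sketched in a few lines. This is not one of the models proposed here: it uses the textbook FDH score, namely the smallest input among the observed units that produce at least as much of every output, divided by the input of the unit under evaluation. The data tuples are the coefficients that appear in the programming models written out for this example; the function and variable names are hypothetical.

```python
# (input i1, output o1, output o2) for DMUs 1..10, taken from the model coefficients
DMUS = [
    (33, 22, 43), (39, 54, 51), (35, 76, 176), (32, 84, 34), (36, 87, 80),
    (35, 43, 5), (34, 33, 136), (40, 45, 56), (38, 23, 67), (39, 54, 68),
]


def fdh_input_efficiency(k, dmus=DMUS):
    """Input-oriented FDH score of unit k (0-based index), single-input case."""
    xk, *yk = dmus[k]
    # inputs of all observed units that weakly dominate unit k in every output
    peer_inputs = [x for x, *y in dmus if all(a >= b for a, b in zip(y, yk))]
    return min(peer_inputs) / xk


scores = [fdh_input_efficiency(k) for k in range(len(DMUS))]
fdh_efficient = [k + 1 for k, s in enumerate(scores) if s == 1.0]
print(fdh_efficient)
print([round(s, 3) for s in scores])
```

Under these assumptions five of the ten units obtain a score of one, which agrees with the count of FDH-efficient units reported for this example.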
The input/output data set is listed in Table 1. Let DMU 5 be the unit under evaluation. When the input and the first output are proportional, the model proposed by Podinovski (2004) may be stated as follows:

$$
\begin{aligned}
\min\ & \theta \\
\text{s.t.}\ & 33(\lambda_1+\mu_1-\nu_1)+39(\lambda_2+\mu_2-\nu_2)+35(\lambda_3+\mu_3-\nu_3)+32(\lambda_4+\mu_4-\nu_4)+36(\lambda_5+\mu_5-\nu_5) \\
& \quad +35(\lambda_6+\mu_6-\nu_6)+34(\lambda_7+\mu_7-\nu_7)+40(\lambda_8+\mu_8-\nu_8)+38(\lambda_9+\mu_9-\nu_9)+39(\lambda_{10}+\mu_{10}-\nu_{10}) \le 36\,\theta, \\
& 22(\lambda_1+\mu_1-\nu_1)+54(\lambda_2+\mu_2-\nu_2)+76(\lambda_3+\mu_3-\nu_3)+84(\lambda_4+\mu_4-\nu_4)+87(\lambda_5+\mu_5-\nu_5) \\
& \quad +43(\lambda_6+\mu_6-\nu_6)+33(\lambda_7+\mu_7-\nu_7)+45(\lambda_8+\mu_8-\nu_8)+23(\lambda_9+\mu_9-\nu_9)+54(\lambda_{10}+\mu_{10}-\nu_{10}) \ge 87, \\
& 43(\lambda_1-\nu_1)+51(\lambda_2-\nu_2)+176(\lambda_3-\nu_3)+34(\lambda_4-\nu_4)+80(\lambda_5-\nu_5) \\
& \quad +5(\lambda_6-\nu_6)+136(\lambda_7-\nu_7)+56(\lambda_8-\nu_8)+67(\lambda_9-\nu_9)+68(\lambda_{10}-\nu_{10}) \ge 80, \\
& \sum_{j=1}^{10} \lambda_j = 1, \\
& \lambda_j - \nu_j \ge 0, \quad j = 1, \dots, 10, \\
& \mu_j, \nu_j \ge 0, \quad j = 1, \dots, 10.
\end{aligned}
\tag{12}
$$

Solving the above linear programming model, the optimal (non-zero) results of assessing DMU 5 are $\theta^* = 0.97$, $\lambda_3^* = 0.32$, $\lambda_4^* = 0.68$, $\mu_4^* = 0.07$. Now consider the new model presented for dealing with both proportionality and integer-valued data:

$$
\begin{aligned}
\min\ & \theta \\
\text{s.t.}\ & 33(\lambda_1+\mu_1-\nu_1)+39(\lambda_2+\mu_2-\nu_2)+35(\lambda_3+\mu_3-\nu_3)+32(\lambda_4+\mu_4-\nu_4)+36(\lambda_5+\mu_5-\nu_5) \\
& \quad +35(\lambda_6+\mu_6-\nu_6)+34(\lambda_7+\mu_7-\nu_7)+40(\lambda_8+\mu_8-\nu_8)+38(\lambda_9+\mu_9-\nu_9)+39(\lambda_{10}+\mu_{10}-\nu_{10}) \le \tilde{x}, \\
& 22(\lambda_1+\mu_1-\nu_1)+54(\lambda_2+\mu_2-\nu_2)+76(\lambda_3+\mu_3-\nu_3)+84(\lambda_4+\mu_4-\nu_4)+87(\lambda_5+\mu_5-\nu_5) \\
& \quad +43(\lambda_6+\mu_6-\nu_6)+33(\lambda_7+\mu_7-\nu_7)+45(\lambda_8+\mu_8-\nu_8)+23(\lambda_9+\mu_9-\nu_9)+54(\lambda_{10}+\mu_{10}-\nu_{10}) \ge 87, \\
& 43(\lambda_1-\nu_1)+51(\lambda_2-\nu_2)+176(\lambda_3-\nu_3)+34(\lambda_4-\nu_4)+80(\lambda_5-\nu_5) \\
& \quad +5(\lambda_6-\nu_6)+136(\lambda_7-\nu_7)+56(\lambda_8-\nu_8)+67(\lambda_9-\nu_9)+68(\lambda_{10}-\nu_{10}) \ge 80, \\
& 36\,\theta \ge \tilde{x}, \\
& \sum_{j=1}^{10} \lambda_j = 1, \\
& \lambda_j - \nu_j \ge 0, \quad \mu_j, \nu_j \ge 0, \quad j = 1, \dots, 10, \\
& \tilde{x} \in \mathbb{Z}_{+}.
\end{aligned}
\tag{13}
$$

Solving the above MILP model for evaluating DMU 5 leads to $\theta^* = 1$, $\lambda_3^* = 1$, $\mu_4^* = 0.61$, $\nu_3^* = 0.53$. Comparing the solutions indicates different results for DMU 5 in the two models. In Model (12), the unit is classified as inefficient, while in Model (13) it is efficient. This is a notable difference. The new MILP model is designed to compare the unit under evaluation with the integer-valued production possibilities only. In Model (12), by contrast, DMU 5 is compared with the continuous part of the production set, which is not necessarily feasible under the integrality assumption. This gives evidence that, in applications where integer-valued data exist, applying the appropriate production set is necessary and the new model provides more reliable results.

Computed scores for the rest of the units are reported in Table 2, where R-EFFI and I-EFFI stand for efficiencies with real and integer-valued data, respectively. Comparing the columns under R-EFFI and I-EFFI, we find that seven out of the ten units have different efficiency scores in these two approaches. Moreover, as the table shows, although DMU 5 is inefficient in R-EFFI, it is efficient in I-EFFI. This shows that the classification of efficient and inefficient units is quite different in these approaches. The new method brings more discrimination power into the evaluation. Finally, the FDH model is run for the data set by ignoring the selective proportionality and integrality assumptions. The results are reported in the last column of Table 2. As the results show, five out of ten units prevail as efficient. This indicates that the FDH model overestimates the efficiency of the units.

Performance analysis and evaluation in educational sectors such as schools and universities have been widely studied by researchers from different perspectives. Kadoić et al. (2018) considered an approach to strategic decision-making in higher education that includes problem identification, objectives of a solution, design and development, demonstration of the artifact, evaluation, and dissemination. They used a design science research process and provided a new decision-making method with two components based on the analytic network process and social network analysis. Begičević et al.
(2010) considered two decision problems in higher education institutions. The first concerns planning the activities involved in executing a portfolio of projects at the institutional level. The second concerns the decision on whether to start a new project application, including which project to choose when there are several project ideas and limited resources. Jablonsky (2016) presented a new method that overcomes a disadvantage of existing models by assessing the efficiency of decision-making units within the whole production chain. Cordero et al. (2016) presented a new method for educational performance assessment in Spain. They used the non-parametric free disposal hull model and provided a decomposition of overall inefficiency into different components while considering the differences between public and state-subsidized private schools. Čampelj et al. (2018) provided a multi-attribute modeling approach for assessing the implementation of Information and Communication Technologies in schools. They claimed that the key feature of their study is the use of qualitative value scales for attributes that do not have exact values. Cherchye et al. (2019) presented a unified method for assessing the productivity of secondary schools in the DEA framework. All of the DEA-based studies reviewed above consider constant or variable returns to scale models, and issues such as the integrality of the data and selective proportionality are not considered. However, two important points must be considered in performance analysis in education. First, factors such as the number of students and the number of teachers are integers, and real values are not admissible for them. Second, it is possible to have a subset of inputs, such as sports training camps, that is not proportional to a subset of outputs, such as the number of students who won sports competitions.
These issues show that the existing approaches need to be modified so that they are applicable in these hybrid cases.

In this section, an empirical application to the performance evaluation of thirty high schools in Iran is presented. According to the Iranian constitution, education for all Iranian children and adolescents is free of charge, and the government has to provide education for all through the Ministry of Education. Iran's educational system previously consisted of elementary, secondary, and pre-university courses. In the final years of the eighties, the educational system changed to a 3-3-3-3 structure, which in general amounts to six years of elementary education followed by secondary education. The first students trained in this system, born in the second half of year 79 and the first half of year 80, were the first to attend the sixth elementary grade. They graduated at the end of the first stage of high school, and this year they are studying in the eleventh grade. Students in practical and technical tracks acquire skills in vocational schools and in institutes of labor and knowledge. Under the previous system, graduates of the pre-university period entered university if they succeeded in the national entrance examination; the pre-university period was removed when the system changed and was replaced by the sixth elementary grade. Student Olympiads are tests held annually at the high school level. Their purpose is to encourage talented Iranian students and to select teams to take part in international scientific Olympiads. At present, the student Olympiads in the country are held in eight fields: mathematics, biology, stem cells and regenerative medicine, chemistry, physics, computer science, astronomy and astrophysics, and geography, among which the Stem Cell Olympiad was rebuilt. The other Olympiads are officially held and include exam facilities.
Based on the foregoing discussion, the performance of the Iranian educational system is an important issue that needs to be analyzed in depth. This section applies our proposed model to data on thirty high schools in Tehran. There are two main education types in Iran: K-12 and higher education. K-12 denotes kindergarten, primary and secondary education taken together, which takes 12 years. K-12 education in Iran is under the supervision of the Ministry of Education, and higher education is supervised by the Ministry of Science, Research, and Technology and the Ministry of Health and Medical Education. We investigated the performance of high school education in Tehran by comparing our new approach with the proportional DEA models, using integer-valued input/output data, and with the traditional FDH model. Moreover, we highlight the efficiency scores and the target points obtained with the different approaches.

Table 3 summarises the selected inputs and outputs for 30 high schools in Tehran. I 1 indicates the number of students in their third year of high school. This quantity shows the population of each high school and obviously takes integer values. I 2 designates the number of specialized teachers for classes held in mathematics and physics. A specialized teacher, who can teach and mentor students in their schooling, needs to be employed for each specific course. These teachers may also assist students with university entrance exams and the Olympiad selection process. I 3 specifies the number of sports training camps held in each high school to support those interested in sports competitions and activities. O 1 shows the number of students who passed all course work and graduated in their third year. O 2 is the number of students who succeeded in university entrance exams plus the number of students who entered the Olympiad competitions. O 3 indicates the number of students who won sports competitions.
The important issue here is the possibility of having a subset of inputs that is proportional to a subset of outputs. Changes in the number of students and teachers are expected to be matched in the same proportion by the number of students who pass all the courses and graduate from their current level and by the number of students who succeed in university entrance exams and Olympiads. Thus, these subsets of inputs and outputs are considered to be proportional to each other. By contrast, the number of students who win sports competitions depends on further factors such as the availability of sports facilities. Therefore, I 3 and O 3 are not considered to be proportional to each other, and they are treated as non-proportional input and output.

Table 4 exhibits the integer-valued input and output data for the 30 high schools in Tehran. Table 5 provides the results of the efficiency analysis when applying the real and integer-valued models. The results show that the efficiency scores obtained from the new integer-valued model (I-EFFI) are always greater than or equal to those obtained from the traditional real-valued model (R-EFFI). The maximum difference occurs for DMU 3, whose I-EFFI and R-EFFI scores are 0.7632 and 0.7223, respectively. There are 13 efficient DMUs under our new integer-valued approach, whereas this number is reduced to 9 under the R-EFFI approach. To be more specific, DMUs 19, 20, 23, and 25 are in the efficient set in our approach while they are in the inefficient set in the real-valued approach. Taking integer-valued data and attempting to calculate non-dominated target points affects the analysis, so that feasible solutions may not be achievable in some cases. This is not the case in the approach presented here, because it searches for suitable points within the corresponding PPS and not merely on the efficient frontier.
These two approaches differ from each other: some units are efficient when integer-valued data are considered, whereas they are evaluated as inefficient when the model is not obliged to search for an integer-valued benchmark. The efficiency scores of these units decline when moving from the integer-valued to the real-valued analysis; this is the case for DMUs 19, 20, 23, and 25, which make up 13.3% of the results. There are also DMUs that are inefficient in both analyses but with different scores. These units did not manage to use their inputs efficiently to produce outputs. Importantly, these units should imitate the corresponding target units from the set of DMUs under evaluation and change their strategy in order to reach an efficient state. These units constitute 56.7% of all results. Around 30% of the high schools remain efficient in both analyses; for them, restricting the analysis to integer-valued data has no effect on the assessment. As can be observed, the efficiency classification obtained with integer-valued data differs from that obtained with real-valued data. The averages of the efficiencies in the real and integer analyses are nevertheless quite close, 0.93338 and 0.940167, respectively. It is important to mention that these results differ in the interpretation and classification of efficient and inefficient units. Some units perform efficiently when integer-valued data are considered in the analysis but not with real-valued data. Thus, the relevant managers may change their strategies. For inefficient units, target units need to be introduced to be followed and imitated (Table 5). The last notable point is that the conventional proportional model may fail to produce target points in the presence of integer-valued data.
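The three shares quoted above follow directly from the counts reported for the 30 schools (13 units efficient in the integer-valued analysis, 9 in the real-valued one). As a short worked check, assuming every unit that is efficient in the real-valued analysis is also efficient in the integer-valued one:

```latex
\[
\underbrace{\frac{13-9}{30}}_{\text{efficient only with integer data}} = \frac{4}{30} \approx 13.3\%,
\qquad
\underbrace{\frac{30-13}{30}}_{\text{inefficient in both}} = \frac{17}{30} \approx 56.7\%,
\qquad
\underbrace{\frac{9}{30}}_{\text{efficient in both}} = 30\%.
\]
```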
Rows of Table 4 (DMU, I1, I2, I3, O1, O2, O3):
DMU  I1   I2  I3  O1   O2  O3
1    113  10  14  88   51  23
2    133  12  20  112  75  13
3    114  9   17  68   61  12
16   110  10  19  94   75  7
17   95   7   18  89   71  5
18   136  8   17  112  93  12

Table 6 summarizes the achieved targets for the real-valued data of Podinovski (2004) and the integer-valued data of Models (9) and (10). Some targets are identical in all three models, and others, which are highlighted, differ. In the highlighted cells of Table 6, the targets measured by the first phase Model (9) and the second phase Model (10) are presented within brackets and parentheses, respectively. As already indicated, the initial input data have integer values. Therefore, it is rational that the corresponding targets must also have integer values. This is not guaranteed in the conventional HRS model. Comparing the two analyses, with real-valued data and with the integer-valued data of Model (9), DMUs may have similar or dissimilar efficiency scores. Some units, such as DMU 3 and DMU 27, have different efficiency scores, while DMU 29 has the same score in both analyses.

In performance analysis, target setting (benchmarking) is also of great importance. It assists decision-makers in analyzing the performance of inefficient units and in suggesting improvement schemes. Targets of efficient DMUs are not scrutinized since these units perform efficiently, and there is no need to introduce specific targets for them from among the set of DMUs. A notable difference can be observed in target setting for inefficient units. For this group, the computed targets of 70% of the units in the HRS model are not applicable when considering integer-valued inputs. The results for the target units show that they are not obtained simply by rounding the data values up or down. Rounding the obtained results is not always an acceptable solution since in some cases it may lead to infeasible points. This can be verified for units such as DMUs 1, 3, 25, and 15. Consider the relations in (11) for introducing non-dominated targets.
The model presented by Kuosmanen and Kazemi Matin (2009) is used to find the efficient integer-valued points of the PPS in order to introduce target (benchmark) units. After solving Model (9), the second phase Model (10) is used to calculate benchmarks with integer-valued components. The optimal slacks obtained from solving Model (10) are listed in Table 7 for DMUs with at least one non-zero slack. Moreover, Table 8 shows the input targets obtained from the model of Podinovski et al. (2014) when all data are assumed to be real-valued. As can easily be verified, some of these points dominate the acceptable integer-valued targets. However, targets without integer values cannot be considered acceptable when integer data exist in the analysis. Consider Table 8 and Model (9) for assessing DMU 1. The observed input vector for this DMU is (113, 10, 14); the results give the real-valued radial input target (104.9286, 9.285714, 13) and the resulting integer-valued target point (104, 9, 13). After solving the second phase model, the non-radial input improvements are (0.9285682, 1.285714, 0). Applying these non-radial changes to the real-valued radial input target yields the integer-valued point (104, 8, 13), which is a better estimate than the integer-valued target point obtained from the first phase model. Additionally, with the non-radial improvements (0, 1.752632, 0), the point (104, 7.247368, 13) is obtained from the second phase of the HRS model of Podinovski et al. (2014). This point is non-dominated, since it comes from the second phase model; it is located on the frontier and dominates the integer input target. However, it does not have integer values and therefore cannot be an acceptable target for the DMU under evaluation. Consequently, (104, 8, 13) is the best integer-valued input target point.
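The arithmetic behind the DMU 1 example can be retraced with a short sketch. It only uses the input figures quoted above; the common contraction factor `theta` is not reported in the text and is inferred here from the ratios, and all variable names are illustrative.

```python
import math

# Input figures quoted in the text for DMU 1.
observed = (113, 10, 14)
radial_target = (104.9286, 9.285714, 13)       # first phase, real-valued
phase2_reductions = (0.9285682, 1.285714, 0)   # non-radial input improvements

# A radial target contracts every input by (approximately) one common factor.
ratios = [t / x for t, x in zip(radial_target, observed)]
theta = sum(ratios) / len(ratios)   # inferred, not reported in the paper

# Subtract the reported reductions from the radial target.
improved = tuple(round(t - s, 3) for t, s in zip(radial_target, phase2_reductions))

# Points that naive rounding of the radial target would produce.
rounded = {
    "floor": tuple(math.floor(t) for t in radial_target),
    "nearest": tuple(round(t) for t in radial_target),
    "ceil": tuple(math.ceil(t) for t in radial_target),
}

print("inferred contraction factor:", round(theta, 6))
print("target after non-radial improvement:", improved)
for rule, point in rounded.items():
    print(f"{rule} rounding gives {point}")
```

None of the three rounding rules reproduces (104, 8, 13), which is consistent with the remark that the integer targets are not obtained by simply rounding the real-valued ones.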
Model (10), the second phase model, maximizes the non-radial improvement and returns the optimal output slacks (4.574, 22.795, 0.232). For DMU 2, the initial input data are (133, 12, 20), and the real-valued input target point is (122, 11.007, 18.346) when the first phase Model (9) is used. After the integrality restrictions are imposed, the integer-valued input target point is (122, 10, 18); in Model (10), the second phase model, all input and output slacks are equal to zero. In evaluating DMU 4, the input vector is (113, 11, 18). Solving the first phase Model (9) leads to the real-valued radial input target (102, 9.929, 16.248). The nearest dominated point is (102, 9, 16). For the non-radial improvements that reach the non-dominated point, the input and output slacks are (0, 1.929, 4.248) and (0, 0.098, 1.358), respectively. Thus, the integer-valued input target is (102, 8, 12). If a non-dominated point is desired, Model (10) produces slacks of (0, 1.454, 4.045), which lead to the point (102, 7.546, 11.955). This point dominates the integer-valued point obtained from the second phase model. However, it is not considered suitable as a target point because it contains real-valued inputs. It should now be clear that solving Model (10) is necessary, since the second phase model can find better target points when assessing most DMUs with integer values. Some of the DMUs are efficient when assessed by Model (10), and there is no need to find corresponding target units for them. For some inefficient DMUs, it is only possible to find radial integer-valued input targets, and no further input improvement can be obtained by non-radial changes. This indicates that the optimal value of the objective function of the second phase model for finding an integer-valued target point is zero. Finding reference points is of great importance in the efficiency evaluation of DMUs and in target setting.
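The selection logic in the DMU 4 example (prefer a dominating point, but only accept integer-valued ones) can be sketched as follows. The comparison is made on inputs only, under the assumption that the candidate points do not differ in their outputs; the helper names `dominates`, `is_integer_valued`, and `best_integer_targets` are illustrative and do not come from the paper.

```python
def dominates(a, b):
    """True if input vector a uses no more of any input than b
    and strictly less of at least one."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))

def is_integer_valued(point, tol=1e-9):
    """True if every component is an integer up to a small tolerance."""
    return all(abs(v - round(v)) <= tol for v in point)

def best_integer_targets(candidates):
    """Integer-valued candidates not dominated by another integer-valued one."""
    integer_points = [p for p in candidates if is_integer_valued(p)]
    return [p for p in integer_points
            if not any(dominates(q, p) for q in integer_points)]

# Input points quoted in the text for DMU 4.
radial_real = (102, 9.929, 16.248)
nearest_dominated = (102, 9, 16)
integer_target = (102, 8, 12)
real_nondominated = (102, 7.546, 11.955)

candidates = [radial_real, nearest_dominated, integer_target, real_nondominated]
print(dominates(real_nondominated, integer_target))   # real point is better ...
print(is_integer_valued(real_nondominated))           # ... but not admissible
print(best_integer_targets(candidates))
```

The real-valued point dominates the integer target yet fails the integrality check, so the sketch retains (102, 8, 12) as the only admissible non-dominated candidate.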
Having this information, managers may establish a role model for their systems. By adjusting their systems accordingly, managers can reach an efficient state. As Podinovski (2004) stated, for any j where * j − * j > 0 , the corresponding DMU j is efficient. Therefore, the DMUs referenced by this formula are identified from the analysis, as presented in Table 9. As indicated in Table 9, there are 10 efficient units: DMUs 5, 6, 7, 9, 10, 17, 18, 19, 21, and 26. In accordance with the basic definition of the reference set, the mentioned efficient DMUs have a positive value of * − * . Models (9) and (10) show radial and non-radial changes in the inputs. With these changes, it is possible to reach points with integer values that dominate the DMUs under assessment. As can be seen from the optimal solutions of the first and second phase models listed in Table 8, not all of these points have integer values. Thus, although they dominate the integer target points obtained from our proposed model, they cannot be considered acceptable target units. For a more substantial analysis of the results, the L1 norm of the difference between the three input elements and their corresponding target points is calculated. The units with the smallest L1 norm values are DMUs 20, 25, and 29. The L1 norm value for each of these three units is 3, and the corresponding efficiency scores are 0.9961, 0.9643, and 0.99896, respectively. It may be stated that these units perform close to efficiently, and such high efficiency scores make it reasonable for the corresponding L1 norm to be small. DMUs 11, 13, and 28 have larger norm L 1 values of 35, 26, 21, and 37 respectively; their individual efficiency scores are 0.7223, 0.8181, 0.8485, and 0.7883. These high L1 norm values are broadly consistent with the lower efficiency scores, as mentioned above. This study also compares the variations of the three inputs with respect to their target points. Considering the L1 norm values, it may be said that x 1 shows the largest reduction, followed by x 3 and then x 2 .
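As a minimal sketch of the distance used above, the L1 norm between an observed input vector and its integer-valued target is the sum of the absolute componentwise differences. The two cases below reuse the observed inputs and integer targets quoted earlier for DMU 1 and DMU 4; the resulting distances are computed here for illustration and are not values reported in the paper's tables. The function and variable names are assumptions.

```python
def l1_distance(observed, target):
    """Sum of absolute componentwise differences between two input vectors."""
    return sum(abs(x - t) for x, t in zip(observed, target))

# (observed inputs, integer-valued input target) as quoted in the text.
cases = {
    "DMU 1": ((113, 10, 14), (104, 8, 13)),
    "DMU 4": ((113, 11, 18), (102, 8, 12)),
}

for name, (observed, target) in cases.items():
    per_input = [x - t for x, t in zip(observed, target)]
    print(name, "reductions per input:", per_input,
          "L1 distance:", l1_distance(observed, target))
```

A small distance means the unit is already close to its target, which matches the observation that units with high efficiency scores have small L1 values.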
Although the VRS and CRS technologies respectively imply safer assumptions and better discrimination, the HRS technology is a more powerful tool for efficiency discrimination. Therefore, in order to increase the discriminatory power of DEA models, it may be more useful to consider HRS over the conventional technologies. This may be particularly useful when inputs and outputs do not necessarily exhibit the same proportionality. The VRS technology is more limited than the HRS technology because the former is a subset of the latter. Comparing average efficiency in HRS and VRS technologies, the former is 0.95 and the latter is 0.93. Note that HRS is not a subset of CRS technology. However, some units in the sample exhibit differing efficiencies, which may be lower in either the CRS or the HRS model. Note also that simple rounding methods for obtaining integer-valued points may lead to dominated or even infeasible points. Thus, utilizing HRS analysis is more discriminating. We have also applied the FDH model to this data set. The resulting efficiency scores are listed in the last column of Table 5. The target points corresponding to inefficient units are given in Table 10. All the computed efficiency scores are greater than 0.9000, and 24 units were classified as efficient, while only 6 units were found to be inefficient. This clearly shows that the efficiency scores are overestimated. As a final word, it can be firmly stated that the proposed model is more useful than prior approaches in the presence of selective proportionality and integer-valued data. In real applications of DEA, we sometimes face cases in which some input/output variables are integer-valued and, simultaneously, a subset of input variables is not proportional to a subset of output variables. Ignoring this important issue leads to misestimated efficiency results.
This paper introduces an axiomatic foundation for constructing a production technology set under the selective proportionality assumption in the presence of integer-valued data. A novel MILP model is then provided to compute a modified Farrell measure with respect to both the HRS and integrality assumptions. A numerical example based on an actual data set for 30 high schools in Tehran is used to demonstrate the proposed approach. This work demonstrates that the efficiency scores obtained from the proposed model are more reliable than those obtained from conventional HRS models, in which the integrality assumption is completely overlooked. Small but notable differences are reported between the targets set by the conventional HRS model and those set by the proposed approach. It is illustrated that the proposed model is useful for benchmarking. Two potential research issues are worth mentioning. First, one can build on our proposed model to incorporate selective convexity and provide Pareto-efficient targets. Second, one could extend the approach to study the nature of returns to scale and to determine an optimal scale size for the observations. These may be considered interesting challenges in expanding our knowledge of the subject.

Developing selective proportionality on the FDH models: new insight on the proportionality axiom
Proportional production trade-offs in DEA
Some models for estimating technical and scale inefficiencies in data envelopment analysis
Decision-making on prioritization of projects in higher education institutions using the analytic network process approach
A multi-attribute modelling approach to evaluate the efficient implementation of ICT in schools
Measuring the efficiency of decision making units
Measuring the efficiency of decision-making units
A unified productivity-performance approach applied to secondary schools
Multiple variable proportionality in data envelopment analysis
Introduction to data envelopment analysis and its uses: with DEA-solver software and references
Introduction to data envelopment analysis and its uses: with DEA-solver software and references
A comparison of public and private schools in Spain using robust nonparametric frontier methods
The measurement of productive efficiency
A step forward on order-α robust nonparametric method: inclusion of weight restrictions, convexity, and non-variable returns to scale
Economies of scope in the health sector: the case of Portuguese hospitals
Applying a hybrid DEA model to evaluate the influence of marketing activities to operational efficiency on Taiwan's international tourist hotels
Efficiency analysis in multi-period systems: an application to performance evaluation in Czech higher education
A technical note on "A note on integer-valued radial model in DEA"
A new method for strategic decision-making in higher education
An integer-valued data envelopment analysis model with bounded outputs
Theory of integer-valued data envelopment analysis under alternative returns to scale axioms
Theory of integer-valued data envelopment analysis
Discrete and integer-valued inputs and outputs in data envelopment analysis. In: Zhu J (ed) Handbook on data envelopment analysis
Bridging the gap between the constant and variable returns-to-scale models: selective proportionality in data envelopment analysis
Production technologies based on combined proportionality assumptions
The hybrid returns-to-scale model and its extension by production trade-offs: an application to the efficiency assessment of public universities in Malaysia
Combining the assumptions of variable and constant returns to scale in the efficiency evaluation of secondary schools
Radial data envelopment analysis models with mixed orientation of input and output

The authors are grateful to the ORIJ Editorial Office for ensuring a timely revision process and for its flexible approach during the COVID-19 pandemic. The authors are also grateful for the constructive and helpful comments and suggestions made by the anonymous referees.
Appendix 1

Proof of Theorem 1 The following items need to be verified:
a. $T^{IDEA}_{HRS}$ satisfies the mentioned axioms.
b. If $T$ is an arbitrary technology set that satisfies the same axioms, B1-B4, then it contains $T^{IDEA}_{HRS}$.

The first step is straightforward: a trivial verification shows that $T^{IDEA}_{HRS}$ contains the observed DMUs (B1) and satisfies natural convexity (B2), natural disposability (B3), and natural selective proportionality (B4). Regarding the second step, consider integer-valued data and let $T \subseteq \mathbb{Z}^{m+s}_{+}$ be an arbitrary technology set that satisfies B1-B4. Let $\tilde{T} = \mathrm{conv}(T) \subseteq \mathbb{R}^{m+s}_{+}$ be the convex hull of $T$. According to , $\tilde{T}$ still contains the observations and satisfies the continuous versions of B2-B4. As shown in Podinovski et al. (2014), the production set $T_{HRS}$ is the smallest set that satisfies the continuous versions of the mentioned axioms B1-B4. So, we have $T_{HRS} \subseteq \tilde{T}$. Now, by restricting the vectors to the integer-valued points, we obtain $T_{HRS} \cap \mathbb{Z}^{m+s}_{+} \subseteq \tilde{T} \cap \mathbb{Z}^{m+s}_{+}$. The left and right sides of (6) can be considered as $T^{IDEA}_{HRS}$ and $T$, respectively. We conclude that $T^{IDEA}_{HRS} \subseteq T$, which completes the proof.

Generalized model: In Sect. 4, we assumed that all input and output data can only take integer values. Now, we extend the proposed model to the case in which some inputs and outputs are real-valued. Suppose the input-output vector $(x, y)$ is partitioned as $(x^{IP}, x^{INP}, x^{P}, x^{NP}, y^{IP}, y^{INP}, y^{P}, y^{NP})$ in which: $y^{P}$ is the real-valued proportional output $s_3$-vector, $y^{NP}$ is the real-valued nonproportional output $s_4$-vector. The following input-oriented MILP model is proposed to analyze the relative efficiency of DMU $k$: It should be noted that $m_1 + m_2 + m_3 + m_4 = m$ and $s_1 + s_2 + s_3 + s_4 = s$, where $m$ and $s$ are respectively the total numbers of inputs and outputs.