key: cord-0047434-pjv05p3h
authors: Kostenko, Dmitri; Arseniev, Dmitriy; Shkodyrev, Vyacheslav; Onufriev, Vadim
title: Pareto Optimization in Oil Refinery
date: 2020-07-11
journal: Data Mining and Big Data
DOI: 10.1007/978-981-15-7205-0_3
sha: e51f28800994f6175b31ee2c8bbe09cde0f0c60a
doc_id: 47434
cord_uid: pjv05p3h

This article describes the multicriteria optimization of a complex industrial control object using Pareto efficiency. The object is decomposed and viewed as a hierarchy of embedded orgraphs (directed graphs). Lists of performance indicators and controlling factors are created from the orgraphs and the technical specifications of the object, which makes it possible to systematize the sources of influence. A neural network trained on statistical data archives approximates key sensor data in order to identify a model of the controllable object and optimize it.

Multicriteria optimization is the simultaneous optimization of two or more conflicting functions at a single point [1]. The need to optimize several key performance indicators (KPIs) simultaneously exists both in business and in industrial production, for example at oil refining facilities. Oil refining is a promising field for multicriteria optimization because of its complexity. The optimization is preceded by decomposition [2] of the refining sequence, followed by model identification [3]. The Pareto optimality principle [4] is then used together with the identified models to build a front of optimal values.

Part of the refining process takes place inside a refraction unit (RU). One of these units was taken as the prototype for our model. The model was assigned the following KPIs [5]: Quality (degree to which the output matches established norms), Performance (output product volume), Efficiency (resource usage efficiency), Reliability (equipment failures per unit of time), and Safety (emergencies per unit of time).
Rectification involves a wide array of parameters, up to several hundred characteristics per RU. These include the temperature and pressure of the splitting section, the column head and the column plates. Both sequential processes (multi-layered raw oil refinement, raw oil heating, raw oil pumping, etc.) and parallel processes (vapor condensation) take place inside and outside the rectification column. Rectification technology also introduces transition products into the process, which adds a further, horizontal level of hierarchy between the operations. It is also essential to account for the time delay caused by the inertia of the system itself and reinforced by the continuous operating mode. Consequently, processes at the highest levels of the hierarchy have non-obvious connections with the lower-level processes, which makes it impossible to represent the dependencies between them with simple functions such as y = f(g, u). Yet it is essential to be able to influence top-level processes by changing the parameters of low-level processes, and vice versa.

In this work the problem is resolved by decomposing a complex system (such as an RU) into individual units and processes. The resulting structure is represented as a graph. The KPI set occupies the top level of the hierarchy. Every KPI is divided into several summands at a lower hierarchy level. This step is repeated until each summand can be unambiguously described by a dependency of the type y = f(x). Dependencies are identified using a neural network trained on the RU statistical data archive. Going up one hierarchy level changes the dependency to y = g(f(x)). Ascending the hierarchical tree in this way makes it possible to determine an explicit dependency between a KPI and an input parameter at the bottom of the hierarchy. However, the top-level key performance indicators may directly contradict each other.
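The bottom-up composition of dependencies described above, from y = f(x) to y = g(f(x)), can be illustrated with a minimal Python sketch. The two level functions below are hypothetical stand-ins chosen only for illustration; in the article the per-level dependencies are identified by a neural network, not written by hand.

```python
def compose(*levels):
    """Chain per-level dependency functions, bottom level first."""
    def chained(x):
        for level in levels:
            x = level(x)
        return x
    return chained


# Hypothetical stand-ins for identified dependencies (assumptions, not the article's models).
def f(x):
    """Bottom level: controllable parameter -> intermediate quantity."""
    return 0.5 * x + 1.0


def g(y):
    """Next level up: intermediate quantity -> KPI summand."""
    return y ** 2


# Going up one hierarchy level turns y = f(x) into y = g(f(x)).
kpi_from_parameter = compose(f, g)

print(kpi_from_parameter(2.0))  # f(2.0) = 2.0, g(2.0) = 4.0
```

Adding a further hierarchy level in this sketch amounts to passing one more function to `compose`, which mirrors how ascending the tree links a bottom-level parameter to a top-level KPI.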
As an example of such a contradiction, raising Performance by forcing aggressive operating parameters will inevitably increase equipment failures and reduce overall Reliability, which in turn damages the Efficiency of the RU or of the refinery as a whole. The "Good - Fast - Cheap" triangle encourages us to use a multicriteria optimization algorithm to balance conflicting key performance indicators.

The aim of this work is to perform a multicriteria optimization and find a Pareto optimal solution. This allows us to find a safe combination of controllable parameters that keeps the target indicators inside the given target intervals.

The analysis is based on a statistical data archive taken from a working refinery. The archive was used to build graphical representations of the dependencies between performance and temperature, which characterise the lowest hierarchy level. In order to optimize the top-level KPIs by changing the bottom-level controllable parameters, a strong correlation between them must be established. To ensure this, dependence model identification was performed [6].

The RU consists of an oil pre-heater with a heat-exchange unit, a fractionating column, a refrigerator and a boiler. The pre-heated oil is injected into the column feeding zone, where it is divided into vapour and liquid phases. During rectification, isopentane is extracted from the top of the column as the fractionator overhead. Heavier fractions are taken from the plates in the middle of the rectification column. The heaviest part, the long residuum, is extracted from the bottom of the column [7]. A simplified scheme showing the distillation inputs and outputs targeted by this work is presented in Fig. 1.

The stripper temperature ($U_4$ in Table 1) was chosen for optimization. The temperature was varied to maximize the output volumes of fractions 240-300 and 300-350 ($G_2$ and $G_3$ in Table 1).

To identify the dependencies between the stripper temperature and the fraction outputs (see Fig. 1), a neural network (NN) was used. It was trained on statistical data collected during 24 h of operation of the prototype column. The network consists of one input layer, one hidden layer and one output layer. The input and output layers each contain one neuron, while the hidden layer contains 10. The NN was trained by backpropagation of errors. The hidden and output layers use the sigmoid activation function (1):

$$f(x) = \frac{1}{1 + e^{-x}} \quad (1)$$

The general formula for the NN is denoted (2); in it, $f$ stands for the sigmoid activation function (1).

Applying this neural network to the input data made it possible to identify the dependencies between the stripper temperature and the output volumes of the 240-300 and 300-350 fractions (see Fig. 2). To verify the agreement between the models (lines in Fig. 2) and the statistical data (crosses in Fig. 2), the correlation coefficients $\rho_{x,y}$ were calculated: $\rho_{x,y} = 0.76312$ for the left graph and $\rho_{x,y} = 0.90781$ for the right graph. The coefficient is given by formula (3), where $\sigma_x$ and $\sigma_y$ are the standard deviations of the corresponding samples:

$$\rho_{x,y} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} \quad (3)$$

The correlation could be improved by increasing the size of the training sample for the neural network. The number of hidden layers and/or hidden neurons could also be changed. However, in order not to deviate from the main theme of the work, the achieved correlations were accepted.

With mathematical models of the relationships between the basic parameters of the oil refining process in place, it is now possible to compare them and determine the optimal points using the multicriteria optimization method. Pareto optimality is a state of resource allocation from which it is impossible to reallocate so as to make any one individual or preference criterion better off without making at least one individual or preference criterion worse off [8]. The Pareto front in the space of target functions is the set of solutions that do not dominate each other, while every other solution in the search space is dominated by at least one of them.
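The notion of dominance behind the Pareto front can be made concrete with a short Python sketch. It assumes that both objectives are maximized, as with the two fraction outputs in this work, and it uses made-up candidate points rather than refinery data. It is a brute-force filter for illustration only, not the evolutionary algorithm used in the article.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one (maximization)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up (objective 1, objective 2) pairs, for illustration only.
candidates = [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0), (2.0, 2.0), (1.0, 1.0), (4.0, 1.0)]

front = pareto_front(candidates)
print(front)  # [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0), (4.0, 1.0)]
```

In this example (2.0, 2.0) and (1.0, 1.0) are dropped because (3.0, 3.0) is better in both objectives, while the four remaining points trade one objective against the other and none of them dominates another.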
For conflicting targets this means that no single solution outperforms every other solution on every target. Mathematically, the problem can be formulated as follows: one must find a vector $X^* = (x_1^*, x_2^*, \ldots, x_n^*)$ that optimizes $F(X)$. Here $X^* \in \mathbb{R}^n$ is a solution vector and $F(X) \in \mathbb{R}^k$ is a vector of target functions, every one of which must be optimized [9].

The Strength Pareto Evolutionary Algorithm 2 (SPEA2) [10] was used for Pareto optimization. Despite its relatively old age, it is a well-tested algorithm that is effective for select applications [11], with advantages that include a more representative spread of non-dominated solutions [12]; it was chosen over other algorithms, including VEGA, FFGA [13] and NPGA [14]. The SPEA2 algorithm can be summarized in six steps, where $P_t$ is the population and $\bar{P}_t$ the archive at generation $t$, $\bar{N}$ is the archive size and $T$ is the maximum number of generations:

• Step 1, Initialization: Generate an initial population $P_0$ and create the empty archive (external set) $\bar{P}_0 = \emptyset$. Set $t = 0$.
• Step 2, Fitness assignment: Calculate the fitness values of the individuals in $P_t$ and $\bar{P}_t$.
• Step 3, Environmental selection: Copy all nondominated individuals in $P_t$ and $\bar{P}_t$ to $\bar{P}_{t+1}$. If the size of $\bar{P}_{t+1}$ exceeds $\bar{N}$, reduce $\bar{P}_{t+1}$ by means of the truncation operator; otherwise, if the size of $\bar{P}_{t+1}$ is less than $\bar{N}$, fill $\bar{P}_{t+1}$ with dominated individuals from $P_t$ and $\bar{P}_t$.
• Step 4, Termination: If $t \geq T$ or another stopping criterion is satisfied, set $A$ to the set of decision vectors represented by the nondominated individuals in $\bar{P}_{t+1}$. Stop.
• Step 5, Mating selection: Perform binary tournament selection with replacement on $\bar{P}_{t+1}$ in order to fill the mating pool.
• Step 6, Variation: Apply recombination and mutation operators to the mating pool and set $P_{t+1}$ to the resulting population. Increment the generation counter ($t = t + 1$) and go to Step 2.

The SPEA2 algorithm was used to find a front of temperatures that maximize the output of fractions 240-300 and 300-350. The previously identified models were used as "experimental data" for the graph shown in Fig. 3.

Fig. 3. Pareto front

As a result, we obtained a Pareto front consisting of eight points (see Table 3). The points are sorted in descending order of preference, so the first point is the most preferred.

In the course of the work the RU was decomposed, and its input, controllable and target parameters were extracted and listed. Based on statistical data analysis, the neural network was trained and then used to identify dependency models between the RU parameters. The models allowed us to determine eight points of the Pareto front.

We plan to extend the practical application of this method. First of all, we will have to decompose the complete sets of dependencies from the basic controllable parameters up to the top-level KPIs. This would allow us to see the key performance indicators of separate units and of the whole refinery changing in real time. Further software development would enable us to reverse the process and estimate controllable parameters able to sustain a pre-defined set of high-level KPIs. Such an instrument will not only make it possible to optimize complex processes but will also ensure much more effective control over them.

[1] Multiple Criteria Optimization: Theory, Computations, and Application
[2] Pareto-optimality is everywhere: From engineering design, machine learning, to biological systems
[3] Digital twin applications: diagnostics, optimisation and prediction
[4] Multicriteria optimization of simulation models
[5] Key performance indicators in the oil & gas industry
[6] Shaping the digital twin for design and production engineering
[7] Oil refining. In: Ullmann's Encyclopedia of Industrial Chemistry
[8] Decisions with Multiple Objectives: Preferences and Value Trade-Offs
[9] Multi-objective optimization of trusses using genetic algorithms
[10] SPEA2: Improving the Strength Pareto Evolutionary Algorithm
[11] How effective and efficient are multiobjective evolutionary algorithms at hydrologic model calibration?
[12] Comparison of SPEA2 and NSGA-II applied to automatic inventory control system using hypervolume indicator
[13] Genetic algorithm for multiobjective optimization, formulation, discussion and generalization
[14] A niched Pareto genetic algorithm for multiobjective optimization