key: cord-0058168-toiiiy7w
authors: Balyakin, Artem; Nurakhov, Nurzhan; Nurbina, Marina
title: Current Trends of Megascience Facilities Utilization
date: 2020-11-20
journal: Comprehensible Science
DOI: 10.1007/978-3-030-66093-2_39
sha: c297eaf0e1d4f883428eabbeb6e424abe35ef5ec
doc_id: 58168
cord_uid: toiiiy7w
The paper considers the specifics of how unique scientific facilities function at present. A number of Russian and European megascience facilities are considered. It is shown that the utilization of megascience facilities currently focuses on the maximum standardization and automation of processes, with a significant part of the activity transferred to a remote access format. These trends received an additional impetus from the quarantine regime during the coronavirus pandemic. The necessary changes to business processes related to operating megascience facilities in remote mode are discussed. The role of e-Infrastructure (including the construction of digital twins) as an essential part of unique scientific facilities is revealed. The idea of a possible new form of international scientific research is formulated.
Globalization is now reflected in every aspect of life, and science is no exception. New globalized scientific knowledge is produced by joint research conducted by consortia of scientists at specialized facilities (research infrastructures) whose creation and operation cost more than any separate entity (country, company, institution, etc.) can finance [1]. Research infrastructures can be of various types: distributed (with parts located in different organizations and/or countries), localized in one place (single-sited), or virtual (e-infrastructures). They may include scientific equipment, scientific collections, archives, databases or any other unique object that can be used for research purposes [2, 3].
Some studies also include research methods and approaches in the definition of "research infrastructures", since the results obtained are tightly connected with the way the data are conceived, analyzed and handled [4]. To emphasize the uniqueness of the work performed, the term "megascience facilities" is often used alongside "unique (specialized) research infrastructure". As a rule, megascience facilities are international research complexes that have no analogues in the world. Despite their size and organizational complexity, they function as a whole and are focused on obtaining scientific results that cannot be achieved at other facilities. Institutionally, a megascience facility can be defined as a scientific "supranational" organization with "independent representations" [5] or as an organizational and managerial innovation [6]. The financing, construction and operation of such installations rely largely on international scientific and technical cooperation. Even when such an installation is built by one country independently (e.g., ISSI-4 in Russia, which will be a fourth-generation X-ray source of synchrotron radiation in Protvino), its use goes beyond the capabilities of one country, and research teams from other countries are expected to be involved from the planning stage onward. It should be understood that the results obtained at megascience facilities find practical application in a wide variety of fields. Thus, developing special legal procedures that concern only and exclusively the above-mentioned technologies proves to be impossible. As a rule, several countries participate in megascience projects.
This immediately causes a number of problems associated with existing legal restrictions on the work of foreign citizens in the country where the megascience facility is located, as well as with the need (and the opportunity) to optimize certain operations carried out within a megascience project. The latter also arises from the need to optimize business processes when services are provided using high-tech tools. The use of megascience facilities involves solving many related problems, from the technical maintenance of the project to its methodological support. One of the pressing issues is organizing scientists' access to research infrastructures and, in the future, to the results of scientific research [6, 7]. In this paper (carried out as part of RFBR project No. 18-29-15015), we consider the features of operating unique scientific equipment in remote access mode. This issue can be treated as a specific, highly specialized service (product), and it plays a crucial role in developing mass remote access technologies, not only for scientific purposes but also for social tasks and commercial applications. The need for remote access to research infrastructure (including unique facilities) has existed for as long as such access has been technically feasible. It is accompanied by a number of solutions that are in high demand in various fields. For example, the growing trend towards peripheral (edge) computing is based on the idea that handling traffic locally, in a distributed manner, reduces the delay in data transfer. Another idea is to attract as many participants as possible in the expectation of obtaining different solutions to one problem. This option (work at megascience facilities in remote access mode) was foreseen at most installations from the start, but the quarantine restrictions associated with the spread of coronavirus infection in 2020 made the remote access format absolutely vital.
Beyond the current tasks of carrying out research projects, the coronavirus pandemic has shown the relevance of providing remote access to research and educational infrastructure, and of enabling remote work in principle, from a completely different perspective: this task should and can be solved in order to ensure public health safety across the globe. Thus, in the framework of this study, the following tasks were set:
• to study the use of remote access techniques in the operation of megascience facilities;
• to consider the novelty of remote access technology (with regard to megascience projects): how new is the "remote approach", and how fundamentally does it change the nature of research compared with traditional research methods?
• to identify new trends in the organization of work at megascience facilities, and to determine the changes required in the business processes that accompany this work.
Essentially, megascience projects are programs for concentrating and effectively exploiting intellectual property. Recent studies from the European Commission show that research projects in the field of ICT and energy, as well as European Research Council (ERC) grants for basic research, create a large amount of intellectual property, including patents, copyrights and trademarks [8].
In the Russian Federation, seven megascience projects are currently planned or being implemented: NICA (Nuclotron-based Ion Collider Facility), a complex of superconducting rings with colliding heavy-ion beams; the International Center for Neutron Research based on the PIK high-flux research reactor (ICNR PIK); a tokamak with a strong magnetic field (Ignitor); an accelerator complex with colliding electron-positron beams, the "VEPP-5 Complex" (Super Charm-Tau Factory); the International Center for Extreme Light Field Research (CIES); the Fourth Generation X-ray Synchrotron Radiation Source (ISSI-4); and the Siberian Ring Source of Photons (SKIF) [6, 9]. Our studies focused on the possibility of applying international experience at ICNR PIK. The PIK reactor will serve as a powerful source of neutrons, which are slowed down to the required energy, extracted from the reactor through special channels and transported through the neutron guide system to the experimental stations. In its parameters and experimental capabilities, the PIK reactor surpasses all existing research reactors, including its only analogue in the world, the HFR reactor at the European neutron research center, the Institut Laue-Langevin (ILL, Grenoble, France). It is expected that 20 major instruments will be commissioned by 2025. ICNR PIK is being built in Gatchina, Leningrad Region, on the basis of the St. Petersburg Institute of Nuclear Physics named after B. P. Konstantinov, a unit of the Kurchatov Institute [10, 11]. In addition, the NRC Kurchatov Institute, on behalf of the Russian government, provides scientific guidance in a number of foreign research projects. Examples include ITER (International Thermonuclear Experimental Reactor), the European X-ray Free-Electron Laser (Eu-XFEL), and the Large Hadron Collider (LHC) at CERN.
Despite the prominent role of fundamental research in the activities of megascience facilities, there are also significant prospects for practical application of their results (mostly in medicine and materials science). For instance, synchrotron source facilities contribute to various sectors of the real economy [9, 12]. The use of super-bright X-ray beams in the study of matter allows us to see how extremely complex systems (e.g., proteins) are arranged, how energy is released in living cells (e.g., human brain cells), how new artificially produced materials function (so-called metamaterials are examined with powerful beams of different nature), and how nanoparticles move and interact [13]. The results obtained at megascience facilities are used by research teams around the world. Regarding the data gained at research infrastructures, two approaches can be distinguished. Under the first approach (followed, e.g., at CERN), an open data policy is implemented: the scientific results obtained by the collaboration are published freely and disseminated in the public domain. At the same time, open access to raw (unprocessed) data is not expected. The data themselves (in processed form) are stored for a long time and are available for re-analysis. The second approach, implemented at European XFEL, exempts private research from the open scientific data policy. All raw data and associated metadata, as well as the results of raw data analysis obtained in private research, belong exclusively to the client who has been granted access and are not subject to the European XFEL Scientific Data Policy. The results of purely scientific experiments, by contrast, become available after an embargo period (3 years). The processed data, the results of interim analysis and the associated metadata are not considered by EuXFEL to be subject to long-term storage (5 years or more).
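The access rule described above for European XFEL can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function names are hypothetical, the 3-year embargo is the figure quoted in the text, and treating the data as available from the exact anniversary date is our simplification rather than a statement of the facility's actual policy.

```python
from datetime import date

EMBARGO_YEARS = 3  # embargo period quoted in the text


def embargo_end(collected: date, years: int = EMBARGO_YEARS) -> date:
    """Return the date on which the embargo is assumed to end."""
    try:
        return collected.replace(year=collected.year + years)
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February
        return collected.replace(year=collected.year + years, day=28)


def is_openly_available(collected: date, today: date, private: bool) -> bool:
    """Private research data stay with the client; scientific data open after the embargo."""
    if private:
        return False
    return today >= embargo_end(collected)
```

For example, under these assumptions data collected on 15 January 2020 in a scientific experiment would be treated as open from 15 January 2023, while data from private research would never be.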
Overall, regardless of the approach adopted at the facility, research infrastructures produce an enormous amount of data (we recall here that "big data" originated from CERN experiments). The large volume of results, and the need to process, store and analyze them comprehensively, have led to work with data being separated from the general list of activities and becoming a distinct factor in the operation of megascience facilities. After minimal initial processing, the collected information is sent for analysis to the data centers of the project participants. The process of obtaining data at megascience facilities (taking CERN as an example) can be represented in general form as follows [14]. During the experiment, data are generated by detectors. Data from the detectors arrive in temporary storage, where they can be pre-filtered. Temporary storage is usually a fast-access system of limited volume, which must be freed on demand to receive the next portion of experimental data. Data are then moved from temporary storage to permanent storage; depending on the experiment, additional processing may or may not take place at this stage. The location of the permanent storage is not tied to a specific installation and can be anywhere in the world. Data transferred to permanent storage are processed and analyzed on site to obtain results (without involving specialists from the original megascience facility). At all stages of acquisition and primary processing, access to the data is (usually) restricted to the limited team of scientists participating in the experiment or to members of the collaboration. Organizing this work required the optimization of the relevant business processes and the reflection of the specifics of remote access in the governing documents.
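The data flow described above (detectors, pre-filtered temporary storage of limited volume, permanent storage, analysis) can be illustrated with a minimal sketch. It is a toy model and not CERN software: the class and function names, the event format and the energy-based pre-filter are all hypothetical and chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Iterable, List


@dataclass
class TemporaryStorage:
    """Fast buffer of limited capacity that must be freed to accept new data."""
    capacity: int
    events: Deque[dict] = field(default_factory=deque)

    def is_full(self) -> bool:
        return len(self.events) >= self.capacity

    def accept(self, event: dict, keep: Callable[[dict], bool]) -> bool:
        """Pre-filter an event and buffer it; return False if it was filtered out."""
        if not keep(event):
            return False
        if self.is_full():
            raise BufferError("temporary storage is full; drain it first")
        self.events.append(event)
        return True

    def drain(self) -> List[dict]:
        """Hand all buffered events over and free the buffer."""
        out = list(self.events)
        self.events.clear()
        return out


def run_pipeline(detector_events: Iterable[dict], capacity: int,
                 keep: Callable[[dict], bool],
                 analyse: Callable[[dict], float]) -> List[float]:
    """Move events from detectors through temporary to permanent storage, then analyse."""
    buffer = TemporaryStorage(capacity)
    permanent: List[dict] = []  # stands in for a remote permanent store
    for event in detector_events:
        if buffer.is_full():
            permanent.extend(buffer.drain())  # free the buffer on demand
        buffer.accept(event, keep)
    permanent.extend(buffer.drain())
    return [analyse(event) for event in permanent]


# Hypothetical usage: keep events above 1.0 (arbitrary units), then "analyse" them.
events = [{"id": i, "energy": e} for i, e in enumerate([0.2, 5.0, 7.5, 0.1, 9.0])]
results = run_pipeline(events, capacity=2,
                       keep=lambda ev: ev["energy"] > 1.0,
                       analyse=lambda ev: ev["energy"] * 2)
```

The design choice worth noting is that the buffer never grows beyond its capacity: it is drained into permanent storage before the next event is accepted, which mirrors the requirement that temporary storage be freed for the next portion of data.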
Thus, in the case of CERN, the MONARC project (Models of Networked Analysis at Regional Centers for LHC Experiments) was launched back in 1998; one of its results was the concept of a hierarchy of centers for data processing, modeling and analysis. There are currently four levels of processing centers. At the bottom is the zero level (Tier-0, the CERN Computing Center), which is engaged in the primary reconstruction of events, calibration, and the permanent storage and archiving of the complete set of raw and simulated data. Next come Tier-1 (13 centers), Tier-2 (about 170 centers) and, finally, Tier-3 (about 50 centers), which are university clusters or centers that provide resources on a voluntary basis for physics data analysis. Tier-1, Tier-2 and Tier-3 facilities are located all over the world and are connected by a dedicated network. A similar data policy has been implemented in the Global Neutrino Network (GNN) [15]. This network can be interpreted as a distributed scientific infrastructure whose elements can be individual research infrastructures and collaborations (for example, the IceCube collaboration, with 47 organizations from 12 countries, or the Dubna multi-megaton deep-sea neutrino telescope). As an infrastructure, GNN is a set of experimental facilities together with a number of scientific institutions (institutes, universities, research centers, etc.) distributed throughout Europe, America and Asia that set up experiments and process the data obtained. Thus, in this case we are dealing with a unique scientific facility (a neutrino telescope) that implements a wide research program in remote access mode (sometimes in real time). Neutrino telescopes can also be regarded as sources of information that is distributed within the network among its participants for further processing.
Without exception, all unique research infrastructures include two main components: a physical component (the complex of scientific equipment itself) and a digital (or informational) infrastructure, possibly distributed, which supports the research and development activities of the installation. Accordingly, the activity at a megascience facility can be divided into two aspects. The first is the solution of engineering (applied) problems that arise during the creation, operation and modernization of the installation. The second, "scientific", aspect is the solution of scientific and practical problems: setting up experiments and analyzing the data obtained. In fact, the second aspect corresponds to "research infrastructure" in the broad sense. Work under the first aspect covers the physical implementation of the experiment and its support (installation, configuration, repair and commissioning of equipment, maintenance of activities). Work under the second aspect is the collection and processing of the results. For the first task, a duty group is stationed in the immediate vicinity of the installation and is responsible for keeping the facility operational (see the CERN examples above). The scope of this group's tasks is limited and consists mainly in starting and, if necessary, configuring the installation software [11]. That is, this group does not directly interfere with the installation itself. In all astronomical projects (Square Kilometre Array, GNN, SETI), the second aspect is more important and more visible. Distributed analysis of the data obtained from the installations is carried out remotely by scientific groups at their own workplaces using their own computing power (for example, as described above for GNN).
For example, the international Borexino project [16] operates an additional, independent data acquisition system (implemented by NRC Kurchatov Institute) based on fast waveform digitizers; it extends the dynamic range of the detector's spectrometric measurements up to ~60 MeV, a region inaccessible to the main electronics. Among other things, the system makes it possible to collect data remotely, without scientific teams having to travel to the location of the detector. It can be seen that remote access is a necessary element of many megascience facilities and has become firmly established in scientific practice. The development of digital infrastructure gave rise to a new phenomenon: e-Infrastructure. Its creation was driven by the aim of optimizing remote work and maximizing the involvement of the scientific community. Another aspect, not considered crucial at the beginning, was the commercialization of results, which the introduction of e-Infrastructure greatly facilitated. We note that digital infrastructure neither duplicates nor replaces the "physical" one. It is worth stressing that this approach (the digitalization of scientific research) is a method of organizing modern scientific research. For the European Union, for example, the main trends in this area are related to the formation of a single scientific space based on open access and the formation of an e-Science system. The first step should be the unification of science, data collection systems and access to them. e-Infrastructure is regarded as the tool for implementing EU policies in science, in which the achievements of the Internet, grid systems, cloud computing and databases are assembled into a new infrastructure. The first step in the development of digital infrastructure in the EU was the creation of the open scientific portal EOSC (European Open Science Cloud), launched in 2016 in order to increase the growth potential of the EU digital economy.
Based on the EOSC experience, the GO FAIR initiative is being prepared to put ideas and proposals related to digital science into practice [17]. To solve the applied problems of processing scientific information in the EU, a number of specialized data processing projects are planned: GEANT (management of scientific and educational network projects), EGI, Advanced Computing for Research (providing computing capacity for CERN and EMBL projects), and PRACE (providing computing power; 465 projects at the moment). In addition, in order to codify and standardize digital infrastructure, the European Union launched the e-Standards project within the framework of the Horizon 2020 program [18]. It is expected that the first consumers of e-Infrastructure will be representatives of the natural sciences. However, the greatest impact and the most significant results will appear in the humanities, which forces the participants in the process to develop appropriate assessment methods and approaches today while simultaneously solving legal issues. It can be seen that the digitalization of scientific research, together with the prominent role of remote access to megascience facilities and/or to the data obtained from unique research infrastructures, is a long-term trend that has been under development and implementation for a long time. Its heightened relevance in 2020 due to the coronavirus pandemic appears to be only an intensification of existing dynamics. However, in order to take these changes and new requirements into account, a number of modifications to some business processes are needed. Below we discuss some ideas that could be implemented to improve current practices. For an appropriate analysis, it is advisable to regard a "megascience"-class project as a high-tech tool designed to solve applied problems and provide research services.
In this regard, the organization of remote activities, in our opinion, requires the following points to be taken into account.
Firstly, it is necessary to minimize the need for the customer to be present at the site where the megascience facility is located. This includes:
• remote approval and conclusion of contracts and the adoption of digital signatures; signing of additional agreements or adaptation of previous ones;
• delivery of research materials from verified (approved) suppliers (which means, inter alia, that customers may have to give up using their own experimental samples); creation of a relevant specialized delivery service and of standardization and examination services;
• transfer of most of the research work at the facility to certified personnel (permanent staff of the facility); development and approval of work algorithms.
Secondly, it is necessary to develop an appropriate digital infrastructure. This means both creating new digital infrastructure objects (data centers, processing algorithms, etc.) and integrating megascience facilities into existing e-Infrastructure elements.
Thirdly, legal and methodological support for the operation of megascience facilities in remote access mode should be provided. Thus, ensuring a remote format for the research infrastructure requires the creation of an appropriate information system. We have considered some aspects of creating such a system earlier: socio-economic aspects [19], the need for a system of information support for the circulation of intellectual property objects [20], and the risks and challenges in ensuring national security [21].
Fourthly, it is necessary to train engineering personnel to solve the related tasks of maintaining and ensuring the remote access mode. One of the solutions to this problem is the use of digital twins of existing projects.
Such super-powerful projects could solve several important tasks at once: reconstruction and restoration of various physical processes with unverified characteristics; participation of a wide range of scientists in long-term experiments in real time; a fundamental reduction in the risks of conducting dangerous experiments or experiments that could adversely affect the environment; use of event generators in the learning process; the ability to adjust the technical specifications for the construction of real megascience facilities; and the ability to reduce the costs of constructing real megascience facilities. Data obtained from computer-generated events are already used in conjunction with physical installations at some megascience facilities. They usually solve local problems of specific experiments. For example, ATLAS (conducted at CERN) uses data obtained by computer event generation (by the Monte Carlo method). The corresponding data set (EVNT) can be generated on the computing resources of any type of site; which sites are used depends only on where the generation tasks were sent. Once generated, the EVNT data are copied to other sites for subsequent processing, which is, in essence, similar to the processing of raw data [22]. Obviously, the creation of a complete digital twin of a megascience project will be a megascience project in itself and will require a comparable investment of funds and intellectual effort. It is assumed that business participants could be attracted to create and support the operation of digital twins (which would also reduce the associated costs). The creation of a digital twin will also make it possible to provide a remote work format for a large number of specialists in the scientific and educational sphere.
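The idea of computer-generated events mentioned above can be illustrated with a toy Monte Carlo generator. This sketch is not the ATLAS software and does not produce EVNT data: the function names, the exponential energy distribution and all numerical values are assumptions made purely for illustration. It generates synthetic "events" from a seeded random number generator and compares a Monte Carlo estimate with the known analytic value.

```python
import math
import random
from typing import List


def generate_events(n: int, mean_energy: float, seed: int) -> List[float]:
    """Generate n synthetic event energies from an exponential distribution."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return [rng.expovariate(1.0 / mean_energy) for _ in range(n)]


def fraction_above(events: List[float], threshold: float) -> float:
    """Monte Carlo estimate of the probability that an event exceeds the threshold."""
    return sum(e > threshold for e in events) / len(events)


events = generate_events(100_000, mean_energy=2.0, seed=42)
estimate = fraction_above(events, threshold=3.0)
exact = math.exp(-3.0 / 2.0)  # analytic value for the exponential distribution
print(f"Monte Carlo estimate: {estimate:.4f}, analytic value: {exact:.4f}")
```

Because the generator is seeded, the same synthetic data set can be regenerated at any site, which loosely mirrors the point in the text that such data can be produced wherever the generation tasks are sent and then copied elsewhere for processing.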
Such democratization of technology will give people (including non-specialists) easy access to knowledge in the field of technology and business without long or expensive training. This policy, dubbed "citizen access", is already being applied in the development of applications, data and analytics, design and knowledge [23]. To ensure the continuity of the educational trajectory for students (both primary and high school), it is proposed to use existing game shells and to launch add-ons that allow the processes under study to be modeled. The Minecraft game shell seems to be a suitable tool for constructing a digital megascience world.
Fifthly, it is necessary to conduct a separate study of the risks and challenges that accompany the active transfer of work at megascience facilities to a remote format. In our opinion, the main problems will be ensuring the safety of the transfer and use of information, and maintaining (and improving) the reputation of the companies that collect, transmit and process data and research samples. These two difficulties stem from the high rate of technological progress, which creates a crisis of confidence. This trend requires a focus on the key elements of trust: honesty, openness, accountability, competence and consistency. At present the importance of this issue is understood, yet decision-making is being delegated to the very digital services in question. Many states and business players use elements of artificial intelligence to make decisions, and there is a temptation to use them for scientific data as well. It seems that the general direction will be to take social, moral and ethical components into account, which will inevitably lead to the adoption of a number of regulatory acts similar to the European Union General Data Protection Regulation [24].
In general, the current trends in the use of megascience facilities continue earlier scenarios, and the coronavirus pandemic has acted as a catalyst for existing trends. We expect an increase in the importance of digital technologies, recognition of the role of e-infrastructure (as compared to the "physical" one) and a gradual transition from solving technical issues (access, processing and transmission of information) to understanding their socio-economic consequences [25]. This implies a detailed study of the legal mechanisms of digital technologies as well as consideration of the moral and ethical risks and challenges associated with accelerated digitalization combined with a remote format of work in the scientific field. Our analysis shows that in the near future a megascience facility will consist of both digital and "physical" components. E-infrastructure will become an inevitable part of any scientific project and should be treated not as an addition to, or a substitute for, the real one, but as an integral part of the facility as a whole. In practice, megascience facilities will always have their digital twins. This process will be accompanied by an enhanced role for e-Infrastructure as a response to current challenges. The preference for a remote interaction format, together with the requirement for uninterrupted operation of installations, will lead to the standardization of activities. From the point of view of scientific research, megascience facilities converted to the remote access format will lose their uniqueness and become commonplace. This means that they will inevitably be replaced by a new generation of unique scientific facilities. A high degree of interconnectedness and inclusion, in science and beyond, will lead to the formation of a new over-infrastructure (super facility).
It is expected that such a structure could be based on elements of the ESFRI scientific infrastructures in the EU, a global network in the Russian Federation, or elements of the GRAIN project in BRICS. It is not yet possible to predict what new unique scientific facilities requiring the personal presence of researchers will look like. However, it is clear that scientific and technological progress is impossible without taking into account the interaction of man, society, science and technology. As one option, the converged technologies approach (a.k.a. NBICS technologies) can be considered [26]. In the Russian Federation, work is already underway to integrate ICNR PIK into the digital future. The prospect of creating a unified research infrastructure of the Union State (based primarily on megascience facilities) is currently being discussed with a view to consolidating resources and expanding opportunities for scientific research. In addition, it is planned to involve third-party organizations widely (the NAS of the Republic of Belarus, German scientific institutions and funds, etc.) in projects to create megascience-class research infrastructure (first of all, in work at synchrotron and neutron research centers in the Russian Federation, including the International Center for Neutron Research based on the PIK high-flux reactor, and others). The convergence of science and technology will bring together scientists from different countries and will facilitate the coordination of research and development aimed at overcoming global challenges, including the development of methods to prevent the spread of pandemics based on genetic research.
FP7 Interim Evaluation, Analyses of FP7 supported Research Infrastructures initiatives in the context of the European Research Area
Social benefits and costs of large scale research infrastructures
Challenges and new requirements for international collaborations
Conceptual foundations of the study of megascience as an organizational and management innovation
Professional qualifications' recognition and megascience projects
Features of development of European science: the program Horizon 2020
Mega-science facilities as an important instrument for synergy of world level education and science
The aspects of a draft model of international scientific and technical cooperation of the international center for neutron research PIK
Life cycle of the project of creation and operation of the megascience facility
Fermilab: Physics, the Frontier and Megascience
Silica coated hard-magnetic strontium hexaferrite nanoparticles
(Borexino collaboration): Comprehensive geoneutrino analysis with Borexino
Dutch Techcentre for Life Sciences
Security issues of scientific based big data circulation analysis
Integrity of Innovation Management and INSO Inventory
Big data technologies as a tool for ensuring national security
Monte Carlo generators in ATLAS software
Top 10 Strategic Technology Trends for 2020
The EU General Data Protection Regulation (GDPR)
Big data: nil novo sub luna
Social aspects of big data technology implementation
Acknowledgments. The authors are grateful to Taranenko S.B. for useful discussions. This work was supported by the RFBR grant No. 18-29-15015.