key: cord-1054503-krv7grwo
authors: Lawrence, J. M.; Fernandes, P. G.
title: A typology of North Sea oil and gas platforms
date: 2022-05-16
journal: Sci Rep
DOI: 10.1038/s41598-022-11975-2
sha: d25d4853dd9a6a9b84a93841ee7b9f866277b0d7
doc_id: 1054503
cord_uid: krv7grwo

Since the commercial exploitation of marine oil and gas reserves began in the middle of the twentieth century, extensive networks of offshore infrastructure have been installed globally. Many of the structures are now nearing the end of their operational lives and will soon require decommissioning, generating renewed interest in their environmental impacts and in the ecological consequences of their removal. However, such work requires selection of a subsample of assets for surveying; censuses of the entire ‘population’ in any given jurisdiction are practically impossible due to their sheer number. It is important, therefore, that the selected sample is sufficiently representative of the population to draw generalized conclusions. Here, a formal clustering methodology, partitioning around medoids, was used to produce a typology of surface-piercing oil and gas platforms in the North Sea. The variables used for clustering were hydrocarbon product, operational state, platform design and material, and substructure weight. Assessing intra-cluster variability identified 13 clusters as the optimum number. The most important distinguishing variable was platform type, isolating floating platforms first, then concrete gravity-based and then fixed steel. Following clustering, a geographic trend was evident, with oil production more prevalent in the north and gas in the south. The typology allows a representative subset of North Sea oil and gas platforms to be selected when designing a survey, or an assessment of the representativeness of a previously selected subset of platforms. This will facilitate the efficient use of the limited funding available for such studies.
The North Sea is currently covered by legislation which states that (with some exceptions) 'the dumping, and leaving wholly or partly in place, of disused offshore installations within the maritime area is prohibited' 31, 32, and so decommissioning will involve the complete removal of these installations and associated infrastructure. As well as the large financial costs associated with the break-down and removal of these structures, there will also be potentially significant environmental impacts of the decommissioning process, through seabed disturbance 33, 34, potential contamination risk 33, 35, 36, and indeed, through the removal of the habitat and opportunities afforded to the local flora and fauna by the structures' physical presence 37. This has led to renewed interest in studying the ecological and environmental impacts of these structures. For regulators to make informed decisions about decommissioning options, more information is needed about the roles played by these structures within the North Sea ecosystem, and the potential impacts of their decommissioning and removal. To this end, many environmental studies have sought to describe or investigate the biology and ecology of these systems, with many more currently underway (e.g. INSITE, www.insitenorthsea.org). However, one aspect that is rarely considered, and that can limit the broad utility of a study's findings, is the selection of a representative sample of structures at which to collect data. With such a large number of offshore oil and gas assets in the North Sea, it is practically (and, normally, financially) impossible to conduct in-depth sampling at all locations, e.g. sampling at every single surface-piercing platform throughout the region. The environmental interactions and ecological impacts of two different structures may be vastly different, however, due to differences in their physical shape, size and design.
As such, it may be impossible to extrapolate the findings of a study conducted on only a small number of a single type of platform to other platform types and locations, and so the conclusions may not be useful when considering ecosystem-wide management planning. To enhance the applicability of the findings of such studies, and to ensure efficient use of the limited funding available (by eliminating the need to repeat the work for a different type of structure), it is essential that a representative sample of structures is selected. Alternatively, if a subsample has already been selected and surveyed, it is important to understand how representative the subsample is of the wider population, so that any limitations of the conclusions can be acknowledged. To a) select a representative subsample before data collection or b) assess the representativeness of a previously selected subsample from the population of North Sea oil and gas platforms, a formal typology is required, whereby platforms are classified into clusters based on common characteristics. Clustering on the selected variables means that variability within clusters is much lower than variability between clusters. The relative split of the population between clusters can then be used either to select a representative sample, or to assess the representativeness of a previously selected sample. To create a formal typology, a comprehensive list of the items to be clustered (platforms in this case) is required, along with the corresponding complete dataset of variables on which the clustering will be based. Here, the OSPAR inventory of offshore installations 38 was used for both the list of the 'population' of offshore platforms (n = 552) and the variables of interest to be used for the clustering. The variables selected were: hydrocarbon product, platform type, operational status, and substructure weight. Other variables were considered (e.g.
water depth, whether the platform is manned or unmanned, latitude and longitude, and produced water disposal method), but were not included for reasons given later (see "Discussion"). These variables include both categorical and continuous data, and so it was necessary to select a clustering methodology that performs effectively with mixed datasets. Partitioning around medoids 39 (PAM) has previously been used for clustering with mixed categorical and continuous data, for a wide variety of applications, including, for example, identifying the psychological effects of COVID-19 40, clustering fishing vessels into discrete fleets 41, grouping Indonesian districts for priority for intervention to address stunting 42, grouping estuaries by a range of biotic and abiotic factors 43, grouping similar patients presenting with back pain 44, and identifying different fishing tactics from catch composition 45, among others 46-48. Prior to the execution of a clustering algorithm, some measure of the distance between individuals is required, based on the variables selected. Here, a Gower distance matrix was used 49, due to its utility with mixed categorical and continuous data. Gower distance is calculated as an average of the distances between two individuals calculated for each variable being considered. If the variable is continuous, a standardised difference is used (absolute difference divided by the range), and if the variable is categorical, the distance is 1 if the individuals differ, or 0 if they are the same. One drawback of the Gower distance metric is that it is sensitive to outliers and non-normality of continuous variables. Consequently, due to the significant right-skewness of substructure weight, the data from this variable were log-transformed to approximate normality; a log(x + 1) transformation was used due to the presence of zeroes in the data (e.g. from the floating structures).
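The Gower calculation described above can be illustrated with a minimal Python sketch. It is not the authors' code (the study used R); the three platform records, their weights and the helper names `prepare` and `gower_matrix` are hypothetical, and every variable is given equal weight, as in the simple average described in the text.

```python
import math

CATEGORICAL = ("product", "type", "status")
CONTINUOUS = ("log_weight",)

# Hypothetical records, for illustration only (not taken from the OSPAR inventory).
platforms = [
    {"product": "Oil", "type": "Fixed steel", "status": "Operational", "weight": 12000.0},
    {"product": "Gas", "type": "Fixed steel", "status": "Operational", "weight": 900.0},
    {"product": "Oil", "type": "Floating", "status": "Operational", "weight": 0.0},
]


def prepare(records):
    """Copy the categorical fields and log(x + 1)-transform substructure weight."""
    return [
        {**{v: r[v] for v in CATEGORICAL}, "log_weight": math.log1p(r["weight"])}
        for r in records
    ]


def gower_matrix(records):
    """Pairwise Gower distances: the mean of the per-variable distances."""
    ranges = {}
    for v in CONTINUOUS:
        values = [r[v] for r in records]
        ranges[v] = max(values) - min(values)
    n_vars = len(CATEGORICAL) + len(CONTINUOUS)
    n = len(records)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = 0.0
            for v in CATEGORICAL:
                # Categorical variable: 0 if the two individuals match, 1 if they differ.
                total += 0.0 if records[i][v] == records[j][v] else 1.0
            for v in CONTINUOUS:
                # Continuous variable: absolute difference divided by the range.
                if ranges[v] > 0:
                    total += abs(records[i][v] - records[j][v]) / ranges[v]
            d[i][j] = d[j][i] = total / n_vars
    return d


d = gower_matrix(prepare(platforms))
```

In this toy example the first and third platforms differ in type and sit at opposite ends of the (log-transformed) weight range, so two of the four per-variable distances are 1 and their Gower distance is 0.5.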
The PAM algorithm applies the following steps, based on the Gower distance matrix, to assign a population of n individuals to k clusters:

1. Assign k randomly selected individuals as cluster medoids.
2. Assign all remaining n-k individuals to the cluster with the most proximate medoid.
3. Reassign as medoid the individual in each cluster which would yield the lowest average distance for that cluster.
4. If a change is made at step 3, return to step 2.

In order to select the optimum number of clusters, the average silhouette width of the population was calculated for arrangements of 2 to 25 clusters. Silhouette width is a measure of the closeness of each individual to the other members of its own cluster relative to the members of the nearest neighbouring cluster:

$$s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$$

where for individual i, s(i) is the silhouette width, a(i) is the average dissimilarity from other members of i's assigned cluster, and b(i) is the average dissimilarity from the members of the nearest neighbouring cluster, i.e. the minimum average dissimilarity between i and the members of each of the other clusters to which i was not assigned. The algorithm was applied using the 'cluster' package in the R statistical programming language 50.

Examining the average silhouette width revealed 13 to be the optimum number of clusters. These clusters, as assigned using the PAM algorithm, can be characterised using their medoids as an exemplar individual from the group (Table 1), similar in interpretation to the median of the group. Using complete-linkage clustering, it is possible to build a dendrogram from the medoids, arranged hierarchically according to their Gower distances, to show how the clusters relate to one another in distance (Fig. 1). The most important variable for differentiating clusters was structure type (floating, fixed steel or concrete); the two largest splits separate out first the floating platforms, then the concrete platforms. Examining the spatial distribution of the various clusters, the most obvious spatial trend is a north-south split of oil and gas respectively (Fig.
2). A formal typology of the oil and gas platforms of the North Sea was created, classifying the 552 individual platforms into 13 clusters. With this typology, and the relative numbers of platforms in each cluster (Table 1), it is possible to select a representative subsample of structures as part of the survey design process for a study which is unable to visit the entire population of platforms. Alternatively, if a subsample has already been selected or sampled, or a survey designer does not have complete freedom to choose which platforms can be surveyed, the representativeness of a sample can be assessed, and so the applicability of the results to the wider population can be highlighted. The variables selected here were relatively basic, dealing only with some aspects of the platforms' physical size and structure, as well as the hydrocarbon product. For each specific application, a set of variables which are likely to be important in the context of the ecological question being asked should be selected, where available. One difficulty is that the currently available publicly accessible databases (e.g. the OSPAR and OGA databases) lack information on some important variables, are incomplete in their records of others, and are inaccurate in yet others. For example, for a study of fish around oil and gas platforms, there are factors relating to substances discharged from the platform which may affect the fish populations below them. These include whether the structures are normally manned or unmanned (and so have discharge of organic matter in the form of kitchen waste and black- and grey-water), and whether the platform is permitted to discharge produced water (formation water extracted along with the hydrocarbon product and process chemicals) or reinjects it into the reservoir.
These data, however, are not included in any public database, and so would require a significant data-mining effort to collect for the entire population, something which was beyond the scope of this study. It may be possible to gather data on these variables for a small selection of platforms (e.g. by contacting the operators directly) and so they could at least be reported as a factor which may affect the ecology of the platforms, even if their comparability with the wider population of unsampled platforms is unknown. There are also transient variables which can differ temporally at any given platform but may impact the surrounding environment, particularly mobile species which can vary their distribution over short timescales (whereas sessile organisms cannot). For example, activities such as drilling will emit noise and vibration into the surrounding water, but only whilst they are actively occurring. These activities can vary over a range of timescales, with periods of activity or inactivity each lasting up to several months. While these variables might be impossible to include in a typology (due to both their highly transient nature and the amount of data gathering required for their inclusion, as mentioned above), it is essential that they be considered as important contextual information which may bias the data collected at any given time and location. Some variables have been deliberately omitted following consideration of their relative importance to the 'definition' of each cluster. Water depth, for example, could have been included due to the potential influence it would have on the ecology of the system around a platform 51-56. Additionally, platform location (latitude, longitude, or both) could have been included in the clustering process, as it will affect the ecology of the site 57, 58.
These variables were omitted, however, because they are descriptors of the environment itself rather than of the platform. It was decided, therefore, that only information about the platform itself would be used for clustering, and that the environmental variables can be controlled for (or investigated) as part of the survey design or data analysis of the environmental study. For example, it will be important to look at the distribution of water depths in each cluster, post-hoc, and ensure that representative samples are selected (in particular in the event of a bimodal distribution in the depth data of a given cluster). One thing that became apparent over the course of this study is the need for high quality, accurate, publicly accessible databases to be maintained, so that the sort of analysis carried out here can be conducted for future studies using case-appropriate variables for clustering. Much of the information resulting from ecological studies of oil and gas infrastructure may be limited by the number of platforms sampled and a lack of clarity over the representativeness of the subsample selected. The current readily accessible databases, while a useful starting point, are limited in the number of potentially ecologically relevant variables they contain, and there are some issues with the accuracy and maintenance of some of the datasets contained therein (e.g. the location data in the OSPAR inventory of offshore installations contain numerous inaccuracies). A typology of oil and gas structures in a given study area (here, the North Sea) is essential for selecting a subsample which is suitably representative of the wider 'population'. This will increase the extent to which the conclusions drawn from a study can be generalised, allowing the more efficient use of limited resources available for such studies.
The work highlights the need for high quality, accurate databases of information about offshore oil and gas infrastructure to be maintained (including a range of relevant variables) so that a similar typology can be created using any and all characteristics deemed of importance to a new study.

The data analysed in this article are freely available online from the OSPAR Data and Information Management System (https://odims.ospar.org/en/).

Figure 2. Map of oil and gas platforms in the North Sea. Symbols denote the cluster to which the platform was assigned during the clustering process, and are coloured by hydrocarbon product. The clusters are designated as structure type_status_product, using the following abbreviations: Fi, Fl and Co for Fixed steel, Floating steel and Concrete gravity base; Op, Cl and Deco for Operational, Closed down and Decommissioned; and Oil, Gas and Con for Oil, Gas and Condensate. The map was generated using the 'maps' and 'mapdata' packages in R (v4.1.2; https://www.r-project.org).
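As an illustration of the clustering procedure described in the Methods, the four PAM steps and the silhouette-based choice of k can be sketched in Python as follows. This is a simplified sketch under stated assumptions, not the authors' implementation: the study used the R 'cluster' package, whose initialisation and update details may differ. The one-dimensional toy points, the function names, the optional `init_medoids` argument (added only to make runs reproducible) and the singleton convention s(i) = 0 are all assumptions of this sketch.

```python
import random


def assign(dist, medoids):
    """Step 2: label each individual with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def pam(dist, k, seed=0, init_medoids=None, max_iter=100):
    """Simplified PAM on a precomputed distance matrix, following the listed steps."""
    n = len(dist)
    # Step 1: k randomly selected individuals (or a caller-supplied start).
    if init_medoids is not None:
        medoids = list(init_medoids)
    else:
        medoids = random.Random(seed).sample(range(n), k)
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # defensive: keep the old medoid for an empty cluster
                new_medoids.append(medoids[c])
                continue
            # Step 3: the member with the lowest total (hence average) distance.
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:  # Step 4: stop once no medoid changes
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


def average_silhouette(dist, labels):
    """Mean of s(i) = (b(i) - a(i)) / max(a(i), b(i)) over all individuals."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # assumed convention: s(i) = 0 for a singleton cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        b = min(sum(dist[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)


def choose_k(dist, k_values, seed=0):
    """Return the k with the highest average silhouette width, plus all scores."""
    scores = {}
    for k in k_values:
        _, labels = pam(dist, k, seed=seed)
        scores[k] = average_silhouette(dist, labels)
    return max(scores, key=scores.get), scores


# Toy data: three tight, well-separated groups on a line (hypothetical values).
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 10.0, 10.1, 10.2]
dist = [[abs(p - q) for q in points] for p in points]
best_k, scores = choose_k(dist, range(2, 6))
```

Because step 1 is random, a single run of this simplified scheme can settle in a local optimum; in practice several random starts, or a library implementation, would be preferable. In the study itself the distance matrix would be the Gower matrix and k would range from 2 to 25.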
Identifying the consequences of ocean sprawl for sedimentary habitats The introduction of coastal infrastructure as a driver of change in marine environments Scour repair methods in the Southern North Sea Epibenthic colonization of concrete and steel pilings in a cold-temperate embayment: A field experiment Fish and sessile assemblages associated with wind-turbine constructions in the Baltic Sea Urban structures as marine habitats: An experimental comparison of the composition and abundance of subtidal epibiota among pilings, pontoons and rocky reefs Sessile Marine invertebrates of Beaufort, North Carolina: A study of settlement, growth, and seasonal fluctuations among pile-dwelling organisms Offshore windmill farms: Threats to or possibilities for the marine environment From sessile to vagile: Understanding the importance of epifauna to assess the environmental impacts of coastal defence structures Coastal man-made habitats: Potential nurseries for an exploited fish species, Diplodus sargus (Linnaeus, 1758) Potential use of marinas as nursery grounds by rocky fishes: Insights from four Diplodus species in the Mediterranean Evaluating the effects of protection on fish predators and sea urchins in shallow artificial rocky habitats: A case study in the northern Adriatic Sea Meals on wheels? 
A decade of megafaunal visual and acoustic observations from offshore oil & gas rigs and platforms in the North and Irish Seas Distribution of bocaccio (Sebastes paucispinis) and cowcod (Sebastes levis) around oil platforms and natural outcrops off California with implications for larval production Artificial habitats host elevated densities of large reef-associated predators High biodiversity oases in a low biodiversity environment Effects of an offshore oil platform on the distribution and abundance of commercially important crab species Modeling fish production for southern California's petroleum platforms Is there a net benefit from offshore structures The ecology of benthopelagic fishes at offshore wind farms: A synthesis of 4 years of research What is the relative importance of phytoplankton and attached macroalgae and epiphytes to food webs on offshore oil platforms Oil platforms off California are among the most productive marine fish habitats globally Coral recruitment and early benthic community development on several materials used in the construction of artificial reefs and breakwaters Impacts from partial removal of decommissioned oil and gas platforms on fish biomass and production on the remaining platform structure and surrounding shell mounds Mobile demersal megafauna at artificial structures in the German Bight: Likely effects of offshore wind farm development Current and projected global extent of marine built structures A global map of human impact on marine ecosystems The location and protection status of earth's diminishing marine wilderness Costing and technological challenges of offshore oil and gas decommissioning in the U.K. 
North Sea Worldwide oil and gas platform decommissioning: A review of practices and reefing options OSPAR's exclusion of rigs-to-reefs in the North Sea Ministerial Meeting of the OSPAR Commission, OSPAR Convention for the Protection of the Marine Environmental impacts of the deep-water oil and gas industry: A review to guide management strategies A multi-criteria decision approach to decommissioning of offshore oil and gas infrastructure Decommissioning of offshore oil and gas structures: Environmental opportunities and challenges Decommissioning of offshore oil and gas facilities: A comparative assessment of different scenarios Environmental benefits of leaving offshore infrastructure in the ocean OSPAR. OSPAR Inventory of Offshore Installations Finding Groups in Data Effects of the COVID-19 pandemic on psychological well-being and mental health based on a German online survey. Front Spanish otter trawl fisheries in the Cantabrian Sea Application of partitioning around medoids cluster for analysis of stunting in 100 priority regencies in Indonesia Multivariate random forest models of estuarine-associated fish and invertebrate communities The use of cluster analysis by partitioning around medoids (PAM) to examine the heterogeneity of patients with low back pain within subgroups of the treatment based classification system Comparison of two approaches to standardize catch-per-unit-effort for targeting behaviour in a multispecies hand-line fishery Distance-based clustering of mixed data Image-based cell subpopulation identification through automated cell tracking, principal component analysis, and partitioning around medoids clustering Integrating the scale of population processes into fisheries management, as illustrated in the sandeel, Ammodytes marinus A general coefficient of similarity and some of its properties R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing Offshore oil platforms and fouling communities in the 
southern Arabian Gulf Rapid assessment of fish communities on submerged oil and gas platform reefs using remotely operated vehicles Fish densities associated with structural elements of oil and gas platforms in southern California Zonation of dominant fouling organisms on northern Gulf of Mexico petroleum platforms Quantitative analysis of fish and invertebrate assemblage dynamics in association with a North Sea oil and gas installation complex An analysis of the sessile, structure-forming invertebrates living on California oil and gas platforms Species distribution modelling of marine benthos: A North Sea case study Diversity and community structure of epibenthic invertebrates and fish in the North Sea

This work was conducted as part of the FISHSPAMMS project, part of the INSITE programme, funded by the UK Natural Environment Research Council under grant number NE/T010681/1. We would like to thank our project partners for their input, in particular Ross Nickson and Paul Shearer, for their expertise and advice about platform characteristics.

P.G.F. suggested the study concept. J.M.L. gathered the data, selected the methods used, conducted the analysis and wrote the first draft of the manuscript. P.G.F. commented on and edited the previous versions of the manuscript, figures and abstract. All authors have read and approved the final manuscript.

The authors declare no competing interests.

Correspondence and requests for materials should be addressed to J.M.L.

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.