key: cord-0185665-xm5gekg4 authors: Das, Arunav title: Visual Analytics approach for finding spatiotemporal patterns from COVID19 date: 2021-01-16 journal: nan DOI: nan sha: 6addb501413ff9cc33a95cacbd2395bad2465c9b doc_id: 185665 cord_uid: xm5gekg4

Bounce Back Loan is one of a number of UK business financial support schemes launched by the UK Government in 2020 amidst the pandemic lockdown. Through these schemes, struggling businesses are provided financial support to weather the economic slowdown caused by the lockdown. £43.5bn in loans had been provided as of 17th December 2020. However, with no major checks for granting these loans and the looming prospect of loan losses from write-offs for failed businesses and fraud, this paper theorizes the prospect of applying a spatiotemporal modelling technique to explore whether geospatial patterns and temporal analysis could aid the design of loan grant criteria for such schemes. Application of a Clustering and Visual Analytics framework to business demographics, survival rate and Sector concentration shows Inner and Outer London spatial patterns in historic business failures and a reversal of those patterns under COVID-19, implying sector influence on spatial clusters. A combination of an unsupervised clustering technique with multinomial logistic regression modelling on the research datasets, complemented by additional datasets on other support schemes, business structure and financial crime, is recommended for modelling business vulnerability to certain types of financial market or economic conditions. The limitations of the clustering technique for high-dimensional data are discussed, along with the relevance of an applicable model for continuing the research through next steps.

Could application of Visual Analytics have led to a different economic outcome from the UK's Bounce Back Loan Scheme (BBLS)? BBLS was launched in April 2020 [1] to support Small and Medium-sized Enterprises (SMEs) impacted by the Coronavirus pandemic lockdown.
Until scheme expiry [2], UK businesses from any location or sector could apply for a loan of £2,000-£50,000 (up to a maximum of 25% of the business's turnover) from Accredited Lenders [3]. £43.5 billion [4] is estimated to have already been granted, of which the National Audit Office estimates £15-£26 billion in loan losses [5] due to businesses failing to repay or fraudulent loan applications. Aside from business owner self-certification, traditional credit checks were bypassed when processing BBLS applications. Whilst bypassing traditional checks could have been necessary to expedite support for businesses facing challenges, potential losses from future write-offs and fraud might have been mitigated through analytical decision making that draws on historical business failure rates and sector concentration, by reviewing patterns for:
• What % of businesses fail every year, i.e., would some businesses fail even without COVID19?
• Does the business death rate change by UK Region, Sector or Time?
• Which Regions and Sectors are more/less impacted by lockdown and require more/less support from BBLS?

The information needed for this research is available from: Office for National Statistics (ONS) historical spatiotemporal datasets [12] with UK business demographics (Birth, Death, Active, by Location and Sector); a spatial dataset [12, 13] with Sectoral information by location; the British Business Bank (BBB) spatiotemporal dataset for BBLS [14]; and the ONS business sentiment survey [15]. Visual Analytics applied to the Baseline and Comparative datasets is expected to provide insights into the efficacy of BBLS.

Whilst no research is available on the role of visual analytics in government-backed financing schemes or in business lending assessments by banks, there is evidence of visual analytics being applied to spatiotemporal datasets of mortgages, network-guaranteed loans and government bonds to aid lending and risk management decisions.
Heinen et al. [6] modelled mortgage defaults over geographic distances in Los Angeles using a spatiotemporal dataset from 2000-2011 to demonstrate geospatial dependence between loan defaults and zip codes. Their approach groups mortgages into risk categories of prepayment vs default by applying a multinomial logistic regression model, followed by a copula method linking mortgage defaults to zip codes using a Matérn correlation function. Timeseries analysis informed the decision to split the dataset into pre- and post-2005 periods, to account for higher default rates after 2005 due to the introduction of new mortgage products. Their model shows high dependency of defaults within a 40 km geographic range due to shared socioeconomic factors. Niu et al. [7] proposed a new way of managing financial systemic risks from business borrowing needs through a network-guarantee framework and segregation of high-default groups. Their approach involves a multi-faceted risk visualization interface linked to default risk predictions made through a gradient boosting tree over artificial spatial features of the business network, using network centrality measures across 400+ nodes and 103 enterprises. Szulc et al. [8] used spatial correlation to assess the impact of changes in a country's government bond credit rating on the government bond yields of neighboring countries, and found a statistically significant negative correlation using dynamic spatial panel models on grouped timeseries and cross-sectional data of ten-year government bonds and economic factors (e.g., inflation, volatility, rating increase/decrease) across 40 countries for 2008-2017. Traditionally, the yield on government bonds has only been evaluated through credit ratings. Their research shows a geospatial correlation of government bond yields. More generally, Andrienko and Andrienko [9] discuss issues with standalone computational methods and interactive visualization for modelling real-world spatiotemporal datasets.
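To make the distance-decaying dependence described for Heinen et al. [6] more concrete, the following is a minimal sketch of a Matérn correlation function. It is not their implementation: the smoothness is fixed at 3/2 so that a closed form can be used, and the range parameter and distances are hypothetical values chosen only for illustration.

```python
import math


def matern_32(distance_km, range_km):
    """Matern correlation with smoothness nu = 3/2 (closed form).

    Both the fixed smoothness and the range parameter are assumptions
    for illustration; they are not taken from Heinen et al. [6].
    """
    if distance_km < 0 or range_km <= 0:
        raise ValueError("distance must be >= 0 and range must be > 0")
    scaled = math.sqrt(3.0) * distance_km / range_km
    return (1.0 + scaled) * math.exp(-scaled)


# Hypothetical range of 20 km: correlation between two zip codes
# shrinks smoothly as the distance between them grows.
for d in (0, 10, 20, 40, 80):
    print(f"{d:>3} km -> correlation {matern_32(d, 20.0):.3f}")
```

The correlation equals 1 at zero distance and decays monotonically, which is the qualitative behaviour behind a finding such as strong dependence within a limited geographic range.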
The authors of [9] propose a new framework for modelling such datasets through a combination of clustering multiple spatial timeseries and interactive grouping of geographic objects and spaces. The proposed framework includes the use of animated cartographic displays, time graphs, interactive tools for cluster analysis and statistical methods for modelling timeseries data, applied to transformed, grouped and filtered spatiotemporal datasets. Yao [10] addresses similar issues by proposing an analysis framework that slices spatiotemporal datasets into Spatialization and Temporalization categories for segmentation, dependency analysis, outlier detection and trend discovery.

Leveraging the specific risk management frameworks and generic visualization techniques for spatiotemporal datasets from the literature review, this paper applies Visual Analytics within the Spatialization and Temporalization framework to analyze timeseries of business demography and Sector concentration, alongside spatial distances/transformation of parliamentary constituencies, and compares them against the spatial distribution and size of BBLS in the UK. The aim is to evaluate whether the applied Visual Analytics can support amendments to government BBLS policy or banks' lending criteria, with potential benefits linked to a lower write-off probability from bad loans. A similar policy review application of Visual Analytics to spatiotemporal data has been presented by Wood et al. [2010] using socio-economic datasets from Leicestershire.

Information published by ONS [12] on the annualized number of Enterprise Births, Deaths and Survivals from 2014-2019 has been used as the Baseline Dataset. Geographically, this information is split by UK District, County and Region, and by Industry. This provides insight into the sectoral and regional composition and concentration of enterprises, listed with the annual number of new, failed and active businesses.
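As a rough illustration of how a baseline of this shape could be summarized, the sketch below aggregates a business death rate (deaths divided by active enterprises) by region, sector or year. The record layout and every figure in it are hypothetical and are not taken from the ONS release.

```python
from collections import defaultdict

# Hypothetical records: (region, sector, year, active_enterprises, enterprise_deaths).
# The figures are made up for illustration only.
RECORDS = [
    ("London", "Retail", 2018, 1000, 120),
    ("London", "Retail", 2019, 1100, 110),
    ("London", "Construction", 2019, 500, 75),
    ("North West", "Retail", 2019, 800, 72),
]


def death_rate_by(records, key_index):
    """Return deaths / active enterprises, grouped by one column.

    key_index selects the grouping column: 0 = region, 1 = sector, 2 = year.
    """
    active = defaultdict(int)
    deaths = defaultdict(int)
    for row in records:
        key = row[key_index]
        active[key] += row[3]
        deaths[key] += row[4]
    return {key: deaths[key] / active[key] for key in active}


by_region = death_rate_by(RECORDS, 0)
by_sector = death_rate_by(RECORDS, 1)
by_year = death_rate_by(RECORDS, 2)

for label, rates in (("region", by_region), ("sector", by_sector), ("year", by_year)):
    for key, rate in sorted(rates.items(), key=lambda item: str(item[0])):
        print(f"{label:>6} {key}: {rate:.1%}")
```

Grouping over one dimension at a time mirrors the three questions of whether the death rate varies by Region, Sector or Time; a fuller analysis would group on combinations of these.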
The London Boroughs business demographic dataset published by ONS [13] has been used for detailed analysis of the Greater London region over an extended period of 2004-2018.

Information about the UK Government's pandemic support initiatives for businesses is available from the Commons Library [14]. This report contains point-in-time information, as of 17th December 2020, about the total number of loan applications and the value of loans (£) granted under BBLS since its inception, split by Parliamentary Constituency. This information has been used as the Comparative Dataset for evaluating the efficacy of BBLS by comparing pre-pandemic regional and sectoral business demise with post-pandemic regional/sectoral BBLS loan distribution, to identify the spatial distribution of regions/sectors with historic high/low business deaths vs BBLS loan distribution.

London Boroughs saw a higher rate of business deaths. 2017 was chosen due to its temporal proximity to 2020 to plot a TreeMap visualizing correlations between the size of business deaths per Borough in 2017, an anomaly evaluation comparing it with the timeseries growth rate in 2016, and the subsequent recovery by 2019. This aids in reviewing the temporal variation of business death rates by Borough and its correlation with BBLS applications. Insights from the Sectoral and Sentiment timeseries were supplemented with London sectoral concentration analysis [Fig9/Appendix6] through 17-Sector histograms, color encoded by cluster labels and marked as 'High(red)-Medium(amber)-Low(green)', to identify Boroughs with high concentration and high vulnerability to COVID19 business impact. Features of geospatial clusters were analyzed through these two timeseries outputs to understand the correlation of historic death rates and sector concentration with spatial BBLS clusters. Geospatial cluster analysis shows businesses in Outer London (OL) have been disproportionately impacted by the COVID19 lockdown.
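One simple way to quantify the relationship between historic death rates and BBLS uptake discussed above is a Pearson correlation across Boroughs. The sketch below is an assumption about how such a check might look, not the method used in this paper; the Borough names and values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical Boroughs: (historic business death rate, BBLS loans per active business).
BOROUGHS = {
    "Borough A": (0.10, 0.30),
    "Borough B": (0.12, 0.25),
    "Borough C": (0.14, 0.20),
    "Borough D": (0.11, 0.28),
}

death_rates = [values[0] for values in BOROUGHS.values()]
bbls_uptake = [values[1] for values in BOROUGHS.values()]
r = pearson(death_rates, bbls_uptake)
print(f"Pearson r between historic death rate and BBLS uptake: {r:.3f}")
```

In this made-up data a negative coefficient would indicate that Boroughs with historically low death rates took up more loans per business, which is the kind of pattern reversal the cluster analysis is looking for.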
Until last year 9 OL Boroughs, highlighted in

A Multinomial and Spline Regression model [22] could be built from analysis of cluster features to support government grant decision making. Such models are used for Credit Risk modelling in retail banking. The model could be trained on London datasets and tested on the Manchester and Birmingham spatiotemporal datasets, which represent a similar level of business activity and BBLS applications. As the BBLS scheme has been extended until Jan 2021, and depending on the pace of economic recovery post pandemic, the proposed model could help in designing future financing schemes for supporting businesses based on spatial patterns, temporal trends, sectoral concentrations, business structures and employees, consumer sentiment and financial crime datasets.

Bounce Back Loan scheme: what does it mean for lenders? | Grant Thornton
Back Loan and Future Fund Schemes | Coronavirus: adapting to change | RSM UK
Coronavirus: Problems with Bounce Back Loans (parliament.uk)
British Business Bank support schemes deliver £68bn of loans to smaller businesses - British Business Bank (british-businessbank
Investigation into the Bounce Back Loan Scheme - National Audit Office
Spatial Dependence in Subprime Mortgage Defaults
Visual Analytics for Networked-Guarantee Loans Risk Management
Spatio-Temporal Analysis of the Impact of Credit Rating Agency Announcements on the Government Bond Yield in the World in the Period of
A visual analytics framework for spatio-temporal analysis and modelling
Research Issues in Spatio-temporal Data Mining
vizLib: Using The Seven Stages of Visualization to Explore Population Trends and Processes in Local Authority Research
UK - Office for National Statistics (ons.gov.uk)
Coronavirus business support schemes: statistics - House of Commons Library (parliament.uk)
Business Impact of COVID-19 Survey (BICS) results - Office for National Statistics (ons.gov.uk)
Raul Zurita-Milla & Changqing Song (2020) An overview of clustering methods for geo-referenced time series: from one-way clustering to co- and triclustering
Exploratory analysis of spatial and temporal data - a systematic approach
Interactive visual clustering of large collections of trajectories
Space-in-time and time-in-space self-organizing maps for exploring spatiotemporal patterns
A hybridized K-means clustering approach for high dimensional dataset
Integration K-Means Clustering Method and Elbow Method For Identification of The Best Customer Profile Cluster
Multinomial Logistic Regression and Spline Regression for Credit Risk Modelling
Analysis of complex, multidimensional datasets
Clustering Optimization Using EigenVector Principal Component Analysis

UK - Office for National Statistics (ons.gov.uk), UK Baseline Dataset
[13] Business Demographics and Survival Rates
Comparative Dataset
[15] Business Impact of COVID-19 Survey (BICS) results - Office for National Statistics (ons.gov.uk), Business Sentiment Dataset

Original Value    Mapped Value
Belfast East      54
Lagan Valley      54
Mid Ulster        54
North Down        54
Strangford        54
Upper Bann        54