key: cord-0543415-cglu26rw
authors: Candela, Massimo; Luconi, Valerio; Vecchio, Alessio
title: Impact of the COVID-19 pandemic on the Internet latency: a large-scale study
date: 2020-05-13
sha: f1b5baf0b87c2a937a033c7b44b5a22735bcd4fd
doc_id: 543415
cord_uid: cglu26rw

The COVID-19 pandemic dramatically changed the way of life of billions of people in a very short time frame. In this paper, we evaluate the impact on Internet latency caused by the increased amount of human activities that are carried out on-line. The study focuses on Italy, which experienced significant restrictions imposed by local authorities, but results about Spain, France, Germany, Sweden, and the whole of Europe are also included. The analysis of a large set of measurements shows that the impact on the network is significant, especially in terms of increased variability of latency. In Italy we observed that the standard deviation of the additional delay (the additional time with respect to the minimum for a given path) during lockdown is ~3-6 times as much as the value before the pandemic. Similarly, packet loss is ~1.4-7 times as much as before the pandemic. The impact is also not negligible for the other countries and for the whole of Europe, but with different levels and distinct patterns.

At the time of writing, the coronavirus disease (COVID-19) pandemic is still ongoing and billions of people are under some form of lockdown. The restrictions faced by citizens are more or less stringent, depending on the resolutions adopted by the different governments, but in many cases non-essential activities have been shut down and a large fraction of people are confined to their homes. Many activities that are normally carried out in physical presence are now taking place online. As a consequence, the amount of traffic on the Internet has increased significantly in recent months. In this paper, we analyze the impact of the COVID-19 pandemic on the latency of the Internet.
Latency is one of the major properties of the network and it is becoming more important every day, as several Internet applications are particularly sensitive to its fluctuations. Examples include on-line videogames [1], [2], video calls, VoIP [3], and IP geolocation [4]. We analyzed a large set of measurements, collected by means of the RIPE Atlas platform [5], to better understand the effects on the network caused by this major change in the way we live. The analysis focuses on Italy which, as of April 2020, has been under lockdown for more than a month, experiencing some of the strictest limitations enforced by authorities: all schools, universities, and non-essential shops are physically closed, and people are authorized to leave their homes only for undeferrable necessities. Distance learning and remote working were applied whenever possible, with a significant increase in the usage of virtual meeting and video-conference applications [6], [7]. Table I summarizes the most important events which could have had an impact on the Italian Internet latency. As can be noticed, limitations on citizens have been introduced progressively. For this reason, the changes caused by the Italian lockdown are studied, in the remainder of this paper, by comparing the situation of the network in the February 11-17 week with the March 10-16 week. The former represents the "normal" situation of the network, as it precedes all restrictive measures; the latter, on the contrary, comes just after the most restrictive limitations. The period in between corresponds to a transitory phase, where partial lockdowns start to impact the network performance. Hereafter, we will use W1 to indicate the baseline week, and W2 for the week just after the major lockdown event. Besides Italy, we also include a brief analysis concerning Spain, France, Germany, Sweden, and the whole of Europe. Spain, France, and Germany have been characterized by restrictions similar to the Italian ones.
Sweden instead adopted a no-lockdown strategy. For Spain, France, and Germany, W2 is shifted according to their major lockdown event (shown in Table I). For the whole of Europe the situation is more heterogeneous, as some countries were less hit by COVID-19 and thus adopted milder restrictions. For Sweden and Europe, W2 corresponds to the March 20-26 week, the last of our observation period which goes, overall, from February 11 through March 26. Results show that the impact is not always the same across the considered countries and on a European scale.

The contribution of this paper can be summarized as follows:
• Some statistics have been recently released by Internet Service Providers (ISPs), and other players of the Internet ecosystem, about the increased amount of traffic they have been exposed to because of the COVID-19 pandemic. However, a picture that leaves aside the very specific points of view of the single operators is still missing. This study provides a more global view, not polarized by the single operator's perspective. In addition, most of the statistics released by operators concern the amount of traffic, with limited (or no) information about latency.
• The number of measurements analyzed is large, thus providing solid foundations for the included statistics. Moreover, besides the sheer number, we decompose the impact on delay according to the most relevant factors, including the time of the day, the type of target (belonging to a content delivery network or not), and the version of the Internet Protocol.
• Besides Italy, results concerning Spain, France, Germany, Sweden, and the whole of Europe are also included. Since measurements have been collected using a single platform, results obtained for the different countries can be compared without the possible bias introduced by the adoption of multiple and heterogeneous systems.

The raw latency data used for this study was collected by RIPE Atlas [5].
We then filtered and enriched such data as detailed in the following subsections.

RIPE Atlas is a globally distributed Internet measurement platform that produces more than 10 000 measurements per second [8]. Among the open platforms aimed at measuring the Internet, RIPE Atlas is the one with the largest number of vantage points [9], and it has a massive presence in Europe. RIPE Atlas automatically carries out Anchoring Measurements (AMs), where the set of targets is pre-defined. In particular, a large set of devices called probes perform measurements towards other devices called anchors, every 15 minutes. Anchors are usually hosted in Internet eXchange Points (IXPs), in the operation centers of ISPs, or in data centers. Hence, they enable monitoring of the core infrastructure of the Internet. The results produced by AMs have been extensively used for both research (see, for instance, [10], [11], [12], [13], [14]) and operational purposes (for example DNSMON [15], a service aimed at monitoring the worldwide core DNS infrastructure). Additionally, RIPE Atlas allows its users to collect measurements towards arbitrary targets. Results of User-Defined Measurements (UDMs) are stored in a database where they are accessible to the public (access is not restricted to the experimenters who triggered them). For its measurements, RIPE Atlas relies on classical network tools. The latency from an Atlas node to a target is estimated by means of the ping tool, which makes use of ICMP echo requests and echo replies. For AMs, ping is launched to collect three Round Trip Time (RTT) values. For UDMs, the default number of collected RTT values is again three, but this number can be changed by the experimenter. Our analysis of the impact of the COVID-19 pandemic relies on the results generated by both AMs and UDMs.
Starting from the dates of the events reported in Table I, we considered all AMs falling in the interval from the 11th of February to the 26th of March 2020. The set of probes and anchors involved in the measurements is stable, with small variations caused by the possible temporary unavailability of probes. Since AMs are performed periodically and for the entire period of study, they provide information on the Internet latency from a stable point of view. The position of source and target nodes is fundamental to analyze the impact on a country-level basis. For AMs, the position of both source and target nodes is well-known, as such information is provided for each node participating in the platform. We use such information to select a subset of the ping measurements having both source and target in Europe. From now on we will refer to such subset as the AM-derived dataset (AMD). AMD is composed of more than 6 billion RTT values generated by 244 603 source-target pairs during the monitored time period.

Users of the RIPE Atlas platform can define their own latency measurements according to their needs and interests. They can select the set of targets to be probed, define the periodicity of probing, and set the time-span of their measurement activities. In UDMs, targets frequently include the servers of major Internet companies, DNS servers, or privately owned network resources. As a consequence, the set of targets involved in UDMs is heterogeneous. Sources are always a subset of RIPE Atlas nodes, but the subset can differ from user to user. For our analysis we are interested in a subset of UDMs where measurement activities were possibly scheduled before the COVID-19 outbreak in Europe and repeated periodically throughout the observation period. To obtain such subset, we adopted the following strategies. First, we extracted only periodic latency measurements, i.e. those configured to be automatically repeated after a certain amount of time.
Second, we discarded the measurements configured to collect fewer than three latency values per ping execution. Third, we further filtered the selection to contain only the measurements concerning source-target pairs that produced successful results on at least ten different days. This was done to eliminate measurements occurring in a concentrated time period. Fourth, we restricted measurements to the ones targeting IP addresses in Europe. It is important to notice that in UDMs only the location of the source nodes is well-known, as it is provided by RIPE Atlas. Hence, for this step, we estimated the position of the targets by using RIPE IPmap [16]. RIPE IPmap uses active geolocation, and it has been reported to be 100% accurate at continent level and 99.58% at country level [17]. Additionally we used MaxMind GeoLite2 [18] as a fallback tool, in case of failed IP geolocation with RIPE IPmap. Measurements where the target was not successfully geolocated using these two tools were discarded. Finally, in some cases, we had to limit the amount of extracted information due to the almost unmanageable volume of data in the repository. In particular, when the number of targets in the geographic area of interest was too large, we randomly selected 10 000 targets and the analysis was restricted to them. Even when we had to limit the number of targets, the number of sources from which measurements were started was not subject to any limitation. From now on we will refer to the dataset built according to the above-described procedure as the UDM-derived dataset (UDMD). UDMD is composed of 596 million RTT values generated by 561 755 source-target pairs during the observation time period.

Let $d$ be the delay of a given Internet path. It can be roughly expressed as $d = d_t + d_p + d_q$, where $d_t$ is the transmission delay, $d_p$ is the propagation delay, and $d_q$ is the time spent in queues and in processing at intermediate routers.
In a wide-area scenario like the one considered, $d_p$ amounts to a significant fraction of the overall delay, as signals can travel at most ∼200 km/ms in fiber (approximately 2/3 of the speed of light $c$). In addition, under the assumption that the geographical properties of a path do not change significantly, the only component that is going to be affected by increased traffic is $d_q$. To isolate $d_q$ from the other components, we estimated $d_t + d_p$ for each pair of endpoints as the minimum delay observed in all measurements for that pair. The observation period has a duration of approximately six weeks, with measurements collected at different times of the day including nighttime, when the network is presumably lightly loaded. Thus, we can reasonably assume that the minimum observed value provides a good estimate of the minimum latency associated with a given path, i.e. the one occurring in the absence of cross-traffic at intermediate routers.

In more detail, each source node, say $a$, measures the Round Trip Time (RTT) towards a target, say $b$, using the ping command at time $t$, which produces a set of delay values

$$\left(d^{ab}_{min,t},\; d^{ab}_{avg,t},\; d^{ab}_{max,t}\right)$$

as the minimum, average, and maximum value found in the execution of the ping command at time $t$ from $a$ to $b$. First, we found the minimum value observed for each source-target pair as

$$m^{ab} = \min_{t \in O} d^{ab}_{min,t}$$

where $O$ is the period of observation. Then, as mentioned, we used $m^{ab}$ as a baseline to estimate the additional time experienced for every single pair of nodes. In particular, we computed

$$q^{ab}_{min,t} = d^{ab}_{min,t} - m^{ab}, \qquad q^{ab}_{avg,t} = d^{ab}_{avg,t} - m^{ab}, \qquad q^{ab}_{max,t} = d^{ab}_{max,t} - m^{ab}.$$

The values of $q_{min,t}$, $q_{avg,t}$, and $q_{max,t}$ were grouped in buckets with a duration of 30 minutes and averaged. Let us call $r_{min,k}$, $r_{avg,k}$, and $r_{max,k}$ the average values obtained in the $k$th bucket $T_k$:

$$r_{min,k} = \operatorname{avg}\left(q^{ab}_{min,t}\right), \qquad r_{avg,k} = \operatorname{avg}\left(q^{ab}_{avg,t}\right), \qquad r_{max,k} = \operatorname{avg}\left(q^{ab}_{max,t}\right)$$

with $t \in T_k$, where $ab$ is the source-target pair.
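The procedure described above (per-pair minimum, subtraction of the baseline, 30-minute bucket averages) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout `(src, dst, timestamp_s, d_min, d_avg, d_max)`, the function names, and the choice to pool all source-target pairs within a bucket are assumptions made only for illustration.

```python
from collections import defaultdict
from statistics import mean

BUCKET_SECONDS = 30 * 60  # 30-minute buckets, as stated in the text


def additional_delay(records):
    """Subtract the per-pair minimum delay from every ping result.

    records: iterable of (src, dst, timestamp_s, d_min, d_avg, d_max);
    this layout is hypothetical, chosen for the example.
    """
    records = list(records)
    # Baseline m for each pair: smallest d_min seen over the whole period O.
    baseline = {}
    for src, dst, _, d_min, _, _ in records:
        pair = (src, dst)
        baseline[pair] = min(d_min, baseline.get(pair, d_min))
    # q values: delay in excess of the baseline of the same pair.
    return [
        (src, dst, t,
         d_min - baseline[(src, dst)],
         d_avg - baseline[(src, dst)],
         d_max - baseline[(src, dst)])
        for src, dst, t, d_min, d_avg, d_max in records
    ]


def bucket_averages(q_records):
    """Average q_min, q_avg, q_max per 30-minute bucket.

    Assumption: all pairs falling in the same bucket are pooled together.
    Returns {bucket_index: (r_min, r_avg, r_max)}.
    """
    buckets = defaultdict(list)
    for _, _, t, q_min, q_avg, q_max in q_records:
        buckets[t // BUCKET_SECONDS].append((q_min, q_avg, q_max))
    return {
        k: tuple(mean(column) for column in zip(*values))
        for k, values in sorted(buckets.items())
    }


if __name__ == "__main__":
    pings = [
        ("a", "b", 0, 10.0, 11.0, 12.0),
        ("a", "b", 600, 12.0, 13.0, 15.0),
        ("a", "b", 1800, 11.0, 12.0, 13.0),
    ]
    print(bucket_averages(additional_delay(pings)))
```

Because the baseline is taken over the whole observation period, every q value is non-negative by construction, and at least one q_min per pair is exactly zero.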
We studied the impact of the COVID-19 pandemic on the latency of the Italian Internet from different perspectives: when both source and target are located in Italy or just one of the two, when considering the time of the day, and when taking into account the version of the Internet Protocol. We also studied the observed latency when the target is part of a content provider network.

We start our analysis from the measurements in AMD with both source and target in Italy. Figure 1a shows the evolution of $r_{min}$, $r_{avg}$, and $r_{max}$ (without distinction between IPv6 and IPv4) for the whole observation period. It can be noticed that all values progressively increase over time. Since Italy experienced some partial lockdown events before the most restrictive one, there is no step-like increase, but rather a continuous one. However, approximately from February 23, when schools were closed in Northern Italy, delays start to grow and higher peaks can be observed. The largest increase can be observed on March 10, which is the date of the complete Italian lockdown. Similar but more accentuated patterns can be seen for UDMD, shown in Figure 1b. In this case, the increase of RTTs can be seen clearly starting from February 22. Besides the generally increasing value of all three curves corresponding to the three r variations, the higher variability of latency is also evident. This can be observed in two ways: i) the distance between the blue curve and the red curve, which represents the difference between the minimum and maximum RTT value in a ping execution, tends to become larger; ii) the fluctuations of each single curve are much more evident. The increased variation is also visible in Figure 2, which compares the value of $r_{min}$ in W1 and W2. The average $r_{min}$ in W2 is 30.2% higher than in W1. Variability of $r_{min}$ is characterized by a much larger increase, with a standard deviation that is approximately 2.8 times as much as the one in W1.
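The W1/W2 comparison of the average and standard deviation of the additional delay can be sketched as follows. This is only an illustrative sketch: the function name, the input format (two lists of bucket-level values), and the use of the population standard deviation are assumptions, since the text does not specify them.

```python
from statistics import mean, pstdev


def week_comparison(r_w1, r_w2):
    """Compare additional-delay values of the baseline week (W1) and lockdown week (W2).

    r_w1, r_w2: lists of r_min values (hypothetical input format).
    Returns the percentage increase of the mean and the ratio of the
    standard deviations (population standard deviation is an assumption).
    """
    mean_w1, mean_w2 = mean(r_w1), mean(r_w2)
    return {
        "mean_increase_pct": 100.0 * (mean_w2 - mean_w1) / mean_w1,
        "std_ratio": pstdev(r_w2) / pstdev(r_w1),
    }


if __name__ == "__main__":
    # Toy values, not taken from the dataset.
    print(week_comparison([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

With this convention, a mean increase of 30.2% and a standard deviation ratio of about 2.8 would correspond to the values reported for AMD with both endpoints in Italy.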
We also estimated the impact of lockdown in terms of packet loss, computed as the fraction of unsuccessful echo request/reply exchanges. For this metric too, the increase is significant: from 1.1e-3 to 1.5e-3 (+41.5%) for AMD, and from 1.6e-3 to 1.1e-2 (+562.9%) for UDMD.

Similar considerations can be made about measurements with sources in Italy and targets spread all over Europe (excluding Italy), as shown in Figures 3a and 3b for AMD and UDMD respectively. In this case too, the generally increased variability during lockdown is clearly visible. Instead, when considering measurements with sources in Europe and targets in Italy, as shown in Figures 3c and 3d, the impact of the Italian lockdown is slightly less noticeable. A possible explanation is that, for Italy, the increased latency during lockdown is mostly localized in access networks. For the step-like variation occurring approximately on February 29, observable in Figure 3d, we registered a considerable increase of packet loss (which reached values greater than 3%, whereas it usually was 0.1%), but we were not able to associate such a temporary increase with any specific event in Europe.

The three curves shown in Figures 1 and 3 show repeated peaks and troughs. This is particularly evident during lockdown, which suggests that latency becomes more influenced by circadian rhythms. To evaluate the influence of the time of the day on latency, we divided Italian measurements into one-hour slots and aggregated them when executed in the same hour. Then, we calculated for each slot the ratio between the average r values collected during W1 and during W2. Results are represented in Figure 4a for AMD. It is clearly visible that the increase is not uniform across the time of the day. Night hours show no considerable increase. This is not surprising, as human activity is very limited at night, and so is the congestion of the Internet. Morning hours show a small increase.
Afternoon hours show a more evident increase, but the highest increase occurs between 16:00 UTC and 22:00 UTC, with a peak between 20:00 and 21:00 UTC. This is interesting, as it suggests that remote working and distance learning have some impact on the Italian Internet latency. However, the major effects can be attributed to leisure activities which typically occur in the afternoon/evening, such as gaming or video streaming (the reader has to keep in mind that Italy is UTC+1 in the analyzed period). This could be due to the lack of other recreational activities during the lockdown.

It is also worth noting that the local minima of the $r_{min}$ curve tend to get higher during the transitory period, but then they start to get lower. This is particularly visible in Figure 1b. The local minima (the troughs) correspond to night hours, when the network is lightly loaded. This phenomenon could be explained by the infrastructural enhancements introduced by network operators to respond to the crisis. For example, during the transition period, TIM (the Italian incumbent) started peering again in public peering LANs of Italian IXPs, which it had not done since the end of 2012 [19]. Also, IXPs reported an increase of traffic of 30-40%, which pushed them to introduce upgrades in the capacity of their peering LANs, as reported during the Italian Network Community Meeting held for the occasion [20]. Additionally, such improvements could justify the situation in Figures 3a and 3c, where the troughs during the lockdown reach smaller latency values compared to the ones of the first analyzed week. To better see this, we calculated the ratio between W1 and W2 for the two cases depicted in Figures 3a and 3c. The results are shown in Figures 4b and 4c, respectively. In both cases the ratio goes slightly below one during night and morning hours. Especially surprising is Figure 4c, where the ratio goes below one for the entire morning.
This could happen because in Figure 4c we consider sources outside Italy, whose access networks are not yet involved in a lockdown phase, and targets in the Italian infrastructure, which as mentioned has been improved to cope with the traffic increase. During night and morning hours the load on the network is still light, so in this particular configuration the performance could improve. To conclude, in evening hours the increased Internet usage during lockdown generally produces larger delays, but in periods of lighter load the network is sometimes more efficient than before the lockdown.

We further analyzed AMD taking into account the version of the IP protocol. AMD contains 24 million IPv4 RTT values and 8.5 million IPv6 RTT values with both source and target in Italy. Figures 5a and 5b show the three r variations for IPv4 and IPv6, respectively. IPv6, in Italy, seems to be characterized by larger variability compared to IPv4, independently of the lockdown period. Such variability increases even more during the lockdown period. For IPv6 the reduction of minimum $r_{min}$ values, probably due to the network improvements introduced by operators, is more visible. We further studied the latency experienced between 20:00 UTC and 21:00 UTC (the peak hour previously identified) by the two protocol versions. In particular, we compared the values collected during W1 with the values collected during W2. For IPv4, values of r in W2 are on average 2 times the values in W1. For IPv6 instead the increase is more modest, 1.2 times on average. To conclude, IPv6 in Italy is characterized by a generally higher variability than IPv4, but in peak hours the former has been impacted less than the latter by stay-at-home orders. This could be due to the lower availability of domestic IPv6 connectivity.
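The hour-of-day comparison used in this section (one-hour slots, ratio between the averages of the two weeks) can be sketched as below. This is a sketch under stated assumptions: the input format `(unix_timestamp_s, r_value)` and the function names are hypothetical, and the ratio is taken as W2 over W1, so that values above one indicate higher additional delay during lockdown; this orientation is inferred from the discussion of ratios going below one, not stated explicitly in the text.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean


def hourly_means(samples):
    """Average r per hour of the day (UTC).

    samples: iterable of (unix_timestamp_s, r_value); hypothetical layout.
    """
    by_hour = defaultdict(list)
    for ts, r in samples:
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).hour
        by_hour[hour].append(r)
    return {hour: mean(values) for hour, values in by_hour.items()}


def hourly_ratio(w1_samples, w2_samples):
    """Ratio W2/W1 of the hourly averages (orientation is an assumption).

    Hours missing in either week, or with a zero W1 average, are skipped.
    """
    w1, w2 = hourly_means(w1_samples), hourly_means(w2_samples)
    return {
        hour: w2[hour] / w1[hour]
        for hour in sorted(w1.keys() & w2.keys())
        if w1[hour] > 0
    }


if __name__ == "__main__":
    week = 7 * 86400
    # Toy values, not taken from the dataset: one night sample and one evening sample.
    w1 = [(0, 1.0), (20 * 3600, 2.0)]
    w2 = [(week, 1.0), (week + 20 * 3600, 4.0)]
    print(hourly_ratio(w1, w2))
```

Restricting the input to a single protocol version before calling `hourly_ratio` would give a per-protocol figure for the 20:00-21:00 UTC slot of the kind quoted for IPv4 and IPv6.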
The IPv4/IPv6 analysis was not repeated using UDMD, as the relatively limited amount of IPv6-based UDMs does not allow us to produce statistically sound results.

Since the vast majority of traffic is nowadays directed towards content providers, which we suspect are also the target of most of the evening traffic (e.g. video entertainment), we investigated the impact of the lockdown on the latencies towards YouTube. We are interested not only in the latencies experienced to reach the YouTube website, but also in the latencies towards the Content Delivery Networks (CDNs) used in the background. It must be noticed that our initial purpose was to collect measurements towards Facebook and Netflix as well. However, we did not find enough measurements to cover the whole observation interval and to obtain statistically significant results. To obtain measurements towards YouTube, we first mapped the names associated with the ad-hoc YouTube CDNs (googlevideos.com and ggpht.com) to the IP addresses that are used to serve content in Italy. For this purpose, we used all the RIPE Atlas probes in Italy to run DNS queries to obtain the IP addresses associated with the YouTube domains. We then checked these IP addresses to be sure that they are not anycast and that they are located in Italy. This last step was carried out using RIPE IPmap. Finally, we used these IP addresses to extract the RIPE Atlas measurements targeting them. We found measurements related to 15 866 different source-target pairs, which produced 437 million IPv4 and 270 million IPv6 RTT values towards YouTube and its CDN. Figure 6 shows the three r variations obtained. An increase of the overall latency and its variability is visible during the days of the lockdown, for both IPv4 (Figure 6a) and IPv6 (Figure 6b). The standard deviation of $r_{min}$ approximately doubles from W1 to W2: 2.1 times and 1.8 times for IPv4 and IPv6 respectively.
Our hypothesis that the increase of RTT registered in the evening hours is due to people forced to stay home and using the Internet for entertainment is strengthened by the results of the measurements towards YouTube. Figure 7a shows ping measurements collected between 05:00 UTC and 06:00 UTC, while Figure 7b shows ping measurements collected between 20:00 UTC and 21:00 UTC, for the entire observation period. In this case too, the $r_{min}$ during night hours slightly improves during the lockdown. The $r_{min}$ in the evening gets moderately higher during the transitory period and abruptly increases after the first day of complete lockdown.

In this section, we show how the latency in other European countries was affected by the COVID-19 pandemic. We considered Spain, France, and Germany, which had their major lockdown events respectively five, seven, and twelve days after Italy. In addition, we considered Sweden, as an example of a country that adopted a no-lockdown policy. Finally, we considered all European countries together. Overall results are depicted in Figure 8. Table II reports the increment of the average and standard deviation of $r_{min}$ after stay-at-home orders. To obtain the increment, for each country we compared the first week of measurements (W1) and the first week of lockdown (W2). For Germany, the considered periods are four days long instead of seven: Germany was put on lockdown at the end of our observation period and a full week was not covered by the collected data. Since there is no unique date for lockdown in the whole of Europe, and since some countries did not even enter a lockdown phase, for Sweden and Europe we compared the first and the last week of the observation period. The two weeks used for comparison are highlighted in light grey in Figure 8. In the following we analyze the considered countries in detail.

Spain shows high variability of latencies even before the lockdown. However, an increase due to enforced restrictions is noticeable in AMD.
Similarly, circadian patterns become more evident (Figure 8a). This is also confirmed by the summary statistics, which show an increment of 41.4% of the $r_{min}$ average, and of 21.1% of the $r_{min}$ standard deviation, in W2 compared to W1. The visual analysis of UDMD shows a progressive increase of latency which starts much earlier than the lockdown, and a transient instability around March 17 (Figure 8b). However, results in Table II show an increment similar to the one observed in AMD: 51.4% for the average and 21.0% for the standard deviation. On March 7, a temporary increase can be observed in Figure 8b. We investigated this phenomenon and found that it is due to a considerable increase in the latency towards one target, observed from multiple sources. We believe this to be an anomalous behavior due to the target itself or the network in its proximity.

In France, the situation is different, as shown in Figures 8c and 8d, which show AMD- and UDMD-based results respectively. In AMD, the overall increase of latency is barely noticeable. Lockdown seems to accentuate the periodic fluctuations due to circadian rhythms. In fact, the summary statistics included in Table II even show a decrement of the $r_{min}$ average and standard deviation in W2 compared to W1 in AMD. However, it must be noticed that the first week of measurements in AMD appears particularly noisy if compared to the other pre-lockdown weeks. UDMD instead shows a much higher variability and an evident impact due to lockdown. The analysis shows an increment of 20.3% and 71.0% respectively for the average and standard deviation of $r_{min}$.

In Germany the situation is different from both Spain and France. First of all, it is worth noting that Germany shows the lowest variability in measurements, which could indicate that the Internet infrastructure is generally more stable. In AMD, the effect of lockdown is noticeable only in terms of amplified circadian patterns (Figure 8e).
The analysis confirms this, by showing just a 5.6% increase of the $r_{min}$ average, and a 37.0% increase of the $r_{min}$ standard deviation. In UDMD, a more significant increase of latency is visible (Figure 8f), which corresponds to a 20.5% higher value for the $r_{min}$ average in W2 compared to W1. It is worth noting that in Germany the degradation of the latencies starts more than one week before the lockdown. This happened because Germany, like Italy, introduced some partial lockdowns and school closures before the major restrictions.

As mentioned above, Sweden did not put the entire country on lockdown. For this reason, the results shown in Figures 8g and 8h are particularly interesting. Both AMD and UDMD show a progressive increase of r and its variability in the considered time period. This can mean either that Swedish people autonomously increased social distancing and implemented stay-at-home policies, or that the performance of the Swedish Internet infrastructure has been affected by the effects of the lockdowns imposed in other countries. The comparison of W1 and W2 also shows a significant increase of the $r_{min}$ average and standard deviation: 77.3% and 293.4% for AMD, and 45.7% and 160.6% for UDMD.

Figure 8i and Figure 8j show respectively the AMD- and UDMD-based results for all of Europe. Additional latency is generally smaller than in the individual countries we analyzed. The response to national lockdowns seems to be fairly good: an accentuation of the variability due to circadian activities can be noticed, starting from the Italian lockdown, but not a significant increase of the overall latencies. This is confirmed by the statistics reported in Table II: the $r_{min}$ average is subject to a modest increase, equal to 13.6% and 9.1% in AMD and UDMD respectively, while the $r_{min}$ standard deviation experiences a significant increase, equal to 183.2% and 50.8%.
These results seem to indicate that, on a continent-level scale, the impact of lockdown is still noticeable but without dramatic changes in observed performance.

It is well-known that computer viruses may cause a slowdown of the Internet [21], [22]. In 2020 we all learned that biological viruses may also affect the global Internet performance, because of the changes they bring to the way we live. In this paper, we analyzed the impact of the COVID-19 pandemic on the latency of the Internet at a large scale. Latency is particularly important not only because it has a profound effect on some classes of applications, but also because it is, by itself, an excellent indicator of the health status of the network. Results, which have been obtained from the analysis of a large number of measurements (the European AMD-based results rely on more than 6 billion latency values), show that the impact of the increased on-line activities is relevant, especially in terms of higher jitter. The major changes have been observed in the evening, the time of the day when most of the on-line activities are related to entertainment. This suggests that distance learning and remote working contributed to a lesser extent in terms of additional network latency. Results obtained for the considered countries show relevant differences, which can be due to the resilience levels of their networks and/or to the non-uniform restrictions imposed by authorities. Overall, the COVID-19 pandemic had a negative impact on the latency of the Internet, in particular in terms of increased variability, because of the higher demand for network resources associated with the changed lifestyle. This phenomenon has been quantitatively evaluated for the first time in this paper on a country-level and continent-level scale, using one of the largest datasets of measurements currently available.
We believe that the provided numbers and the related analysis, despite being limited to a portion of the Internet, definitely shed light on this previously unseen event in the history of the Internet.

Fig. 8: r in Spain, France, Germany, Sweden and whole Europe; panel (j): UDMD, from EU to EU. The dashed vertical lines correspond to lockdown events for the considered region. The gray areas correspond to W1 and W2 for the considered region.

Latency and player actions in online games
The effects of latency on player performance in cloud-based games
Mouth-to-ear latency in popular VoIP clients
Using RIPE Atlas for geolocating IP infrastructure
RIPE Atlas: A global Internet measurement network
Our commitment to customers during COVID-19
Zoom CFO explains how the company is grappling with increased demand
A survey on Internet performance measurement platforms and related standardization efforts
Vantage point selection for IPv6 measurements: Benefits and limitations of RIPE Atlas tags
Pinpointing delay and forwarding anomalies using large-scale traceroute measurements
Disco: Fast, good, and cheap outage detection
Joint minimization of monitoring cost and delay in overlay networks: optimal policies with a Markovian approach
Systems for characterizing Internet routing
Visualization and monitoring for the identification and analysis of DNS issues
Tracing cross border web tracking
IP Geolocation and Online Fraud Prevention
AIIP - presenza TIM negli IXP bene per il paese
Meeting summary
The Internet worm program: An analysis
Worm epidemics in high-speed networks