key: cord-0478686-eqgyg1ez
authors: Diebold, Francis X.
title: Real-Time Real Economic Activity: Entering and Exiting the Pandemic Recession of 2020
date: 2020-06-26
journal: nan
DOI: nan
sha: bb9796b6248620667b68cf555fb8163bd3656a9b
doc_id: 478686
cord_uid: eqgyg1ez

Entering and exiting the Pandemic Recession, I study the high-frequency real-activity signals provided by a leading nowcast, the ADS Index of Business Conditions produced and released in real time by the Federal Reserve Bank of Philadelphia. I track the evolution of real-time vintage beliefs and compare them to a later-vintage chronology. Real-time ADS plunges and then swings as its underlying economic indicators swing, but the ADS paths quickly converge to indicate a return to brisk positive growth by mid-May. I show, moreover, that the daily real activity path was highly correlated with the daily COVID-19 cases. Finally, I provide a comparative assessment of the real-time ADS signals provided when exiting the Great Recession.

Accurate assessment of current real economic activity ("business conditions") is key for successful decision making in business, finance, and policy. It is difficult, however, to track business conditions in real time, both because no single observed economic indicator is "business conditions", and because different indicators are available at different observational frequencies, and with different release delays. Nevertheless there exists the tantalizing possibility of accurate real-time business conditions assessment ("nowcasting"), and recent decades have witnessed great interest in nowcasting methods and applications (e.g., Banbura et al. (2011)).

The workhorse nowcasting approaches involve dynamic factor models, which relate a set of observed real activity indicators to a single underlying latent real activity factor. Both "small data" approaches (e.g., based on 5 indicators) and "Big Data" approaches (e.g., based on 500 indicators) are available.
Small data approaches were developed first, and they typically involve maximum likelihood estimation (e.g., Stock and Watson (1989)). Subsequent Big Data approaches, in contrast, typically involve two-step estimation based on a first-step extraction of principal components (e.g., Stock and Watson (2002), McCracken and Ng (2016)).

Both introspection and experience reveal that Big Data nowcasting approaches are not necessarily better. First, they are more tedious to manage, and less transparent. Second, they may not deliver much improvement in factor extraction accuracy, which increases and stabilizes quickly as the number of indicators increases (Doz et al., 2012). Third, casual inclusion of many indicators can be problematic because a poorly-balanced set of indicators can create distortions in the extracted factor (Boivin and Ng, 2006), whereas small data approaches promote and facilitate hard thinking about a well-balanced set of indicators (Bai and Ng (2008)).

Against this background, in this paper I assess the performance of a leading small-data nowcast, the Aruoba-Diebold-Scotti (ADS) Index of Business Conditions (Aruoba et al., 2009). ADS is designed to track real business conditions at high frequency, and it has been maintained and released in real time by the Federal Reserve Bank of Philadelphia continuously since 2008. Its modeling style and underlying economic indicators build on classic early work in the tradition of Burns and Mitchell (1946), Sargent and Sims (1977), and Stock and Watson (1989). The underlying indicators span high- and low-frequency information on real economic flows: weekly initial jobless claims; monthly payroll employment growth, industrial production growth, personal income less transfer payments growth, manufacturing and trade sales growth; and quarterly real GDP growth.

Crucially, I assess ADS using only information actually available in real time.
This is required for truly credible real-time evaluation, and it can only be achieved by using nowcasts produced and permanently recorded in real time, which is very different from simply removing final-revised data and inserting vintage data into an otherwise ex post analysis. Unfortunately, such evaluations are rare, because there simply are not many instances of long series of nowcasts produced and recorded in real time. ADS, however, has been produced and recorded in real time roughly twice weekly since late 2008, so I can provide real-time performance assessments both exiting the Great Recession and entering/exiting the Pandemic Recession.

Ultimately the paper takes a two-pronged approach. The first is the above-sketched attempt at real-time ADS assessment, asking whether ADS sends reliable signals. The second conditions on reliability of the signals, and uses them to assess what actually happened in the Pandemic Recession of 2020 (and, for comparison, in the Great Recession of 2007-2009). The two prongs are ultimately inseparable and woven together in various ways throughout the paper.

I proceed as follows. In section 2, I provide background on aspects of ADS construction, updating, ex post characteristics, and performance evaluation. In section 3, I examine ADS entering/exiting the Pandemic Recession, and I relate the real-time ADS path to the real-time COVID-19 path. In section 4, I provide a comparative examination of ADS exiting the Great Recession. I conclude in section 5.

2 Nowcast Construction, Characteristics, and Assessment

Here I provide background on the ADS index construction (section 2.1), ex post historical characteristics (section 2.2), and general issues of relevance to assessing ex ante nowcasting performance (section 2.3).

ADS is a dynamic factor model with multiple mixed-frequency real activity indicators driven by a single latent real activity factor. The ADS index is an estimate of that latent real activity factor.
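A standard way to estimate a latent factor of this kind is the Kalman filter. The following is a minimal Python sketch of the idea, not the ADS specification itself: one latent factor, assumed here to follow an AR(1), is filtered from several indicators, and an indicator that is missing on a given day is simply skipped. The names `filter_factor`, `loadings`, `meas_var`, `rho`, and `state_var` are hypothetical, the default parameter values are illustrative, and the indicator-by-indicator update assumes mutually uncorrelated measurement errors.

```python
def filter_factor(obs, loadings, meas_var, rho=0.9, state_var=1.0):
    """Filter one latent AR(1) factor from indicators with missing values.

    obs: list of rows, one per day; each row holds a float or None per indicator.
    loadings, meas_var: per-indicator factor loading and measurement-error variance.
    Returns a list of (filtered_mean, filtered_variance), one pair per day.
    """
    # Start from the stationary distribution of the assumed AR(1) factor.
    mean, var = 0.0, state_var / (1.0 - rho ** 2)
    filtered = []
    for row in obs:
        # Prediction step: propagate the factor one day ahead.
        mean, var = rho * mean, rho ** 2 * var + state_var
        # Update step: process each observed indicator in turn, skip missing ones.
        for y, lam, r in zip(row, loadings, meas_var):
            if y is None:
                continue
            gain = var * lam / (lam ** 2 * var + r)
            mean = mean + gain * (y - lam * mean)
            var = (1.0 - gain * lam) * var
        filtered.append((mean, var))
    return filtered


# Illustrative input: indicator 1 observed on day 1 only, indicator 2 never observed.
example = filter_factor([[1.0, None], [None, None]], [1.0, 0.5], [1.0, 1.0])
```

On a day with no observations the update step is skipped entirely, so the factor estimate just decays toward zero and its variance grows, which is how a daily model can accommodate weekly, monthly, and quarterly indicators.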
Importantly, the model is specified such that the real activity factor tracks the demeaned growth rate of real activity. Progressively more negative or positive values indicate progressively worse- or better-than-average real growth, respectively. Because ADS tracks real activity growth, not level, a positive value does not necessarily mean "good times"; rather, it means "good growth", which may be from a level well below trend, as for example in the early stages of a recovery.

ADS is specified at daily frequency, allowing as necessary for missing data for the less-frequently observed variables. Importantly, despite complications from missing data, time-varying system matrices, aggregation across frequencies, etc., the Kalman filter and associated Gaussian pseudo likelihood evaluation via prediction-error decomposition remain valid, subject to some well-known modifications. Model estimation is therefore straightforward, after which the Kalman smoother produces an optimal extraction of the underlying real activity factor. That is, the Kalman smoother produces the index: The extracted sequence at any time t* is the vintage-t* ADS sequence, {ADS_1, ADS_2, ..., ADS_{t*}}.

The first ADS vintage was released 12/5/2008, covering 3/1/1960 through 11/30/2008. Since then, ADS has been continuously updated whenever new data are released. The Kalman smoother is re-run, generally within two hours of the release, and the newly-extracted index from 3/1/1960 to "the present" is re-written to the web. ADS has been updated approximately eight times per month on average since inception.

In Figure 1 I show the ADS index from 03/01/1960 through 12/31/2013, as assessed in the 6/26/2020 vintage. The sample range is well before the vintage pull date, so the chronology displayed is (intentionally) ex post. I do this because it is instructive to examine the ex post chronology before passing to real time assessment, which can only be done after ADS went

Several features are noteworthy.
For example, the ADS chronology coheres strongly with the NBER chronology, plunging during NBER recessions. In addition, several often-discussed features of the business cycle are evident in ADS, such as the pronounced moderation in volatility during the Greenspan era.

The ADS value added relative to the NBER chronology stems from the facts that (1) it is a cardinal measure, allowing one to assess not only recession durations, but also depths and patterns (see Table 1), and (2) its updates arrive in timely fashion, whereas the starting and ending dates of NBER recessions are typically not announced until well after the fact (again see Table 1). Of course, if ADS is to be a useful guide for business and policy decisions, its frequently-arriving updates must provide reliable signals in real time, not just ex post as in Figure 1. I now turn to that issue.

[Notes to Table 1: Recession severity S is the product of depth and duration. Both D and S use a late-vintage ADS chronology and the NBER recession chronology.]

Truly credible nowcasting performance assessment requires using vintage information, which emerges as the limit of a sequence of progressively more realistic and credible nowcast/forecast evaluation approaches:

(1) Use full-sample estimation, and use final revised data
(2) Use expanding-sample estimation, and use final revised data
(3) Use expanding-sample estimation, and use vintage data ("Pseudo Real Time")
(4) Use expanding-sample estimation, and use vintage information ("Real Time").

Approaches (1) and (2) are clearly unsatisfactory: Approach (1) uses time periods and data values not available in real time, and approach (2) is an improvement but still uses data values not available in real time. Approach (3), involving vintage data, is typically viewed as the gold standard. It is implemented comparatively infrequently, however, due to the tedium involved and the fact that vintage data are often unavailable.
Approach (4), involving vintage information, limits the information set to that available and actually used in real time, which is more restrictive than merely limiting the data to that available in real time. It is, however, almost never implemented. To appreciate why fully-credible assessment requires vintage information rather than just vintage data, consider the following:

(1) Econometric/statistical theory and experience evolve, prompting changes to the estimation procedure; the frequency and timing of re-estimation and its interaction with benchmark revisions; the estimation sample period; allowance for parameter variation and breaks; the treatment of outliers; the strength of regularization employed; the predictive loss function employed; etc.

(2) Economic theory and empirical economic experience evolve. Over time this may prompt, for example, the removal or re-weighting of some component nowcast indicators and/or addition of others (e.g., Diebold and Rudebusch (1991)), as well as deeper changes in the nowcasting model.

(3) Exact times and reliability of nowcast/forecast calculation and release may differ due to technological problems; outright mistakes in nowcast/forecast construction; evolving or changing software algorithms and associated bugs; parallel problems at the agencies responsible for the underlying data and decisions regarding how to deal with them in forecast/nowcast construction; etc.

For these and other reasons, just as truly credible evaluation requires refraining from endowing agents with better data than were actually available in real time, so too does it require refraining from endowing them with better economic or statistical models and related tools than were actually available in real time, better judgment and decision-making abilities/choices than were actually manifest in real time, etc.
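The data side of evaluation approaches (1) through (3) can be sketched in Python as follows. This is an illustration under stated assumptions, not a procedure from the paper: the function name `estimation_sample`, the dictionary layout (release date mapped to observations by observation date), and all numbers are hypothetical. Approach (4) deliberately raises an error, because vintage information cannot be rebuilt from data alone; it requires the nowcasts as they were actually recorded.

```python
from datetime import date


def estimation_sample(vintages, eval_date, approach):
    """Return {obs_date: value} used at eval_date under approaches (1)-(3).

    vintages maps release_date -> {obs_date: value}; later releases may revise
    earlier values and add new observations.
    """
    if approach in (1, 2):
        release = max(vintages)  # final revised data
    elif approach == 3:
        # Latest vintage that had actually been released by eval_date.
        release = max(r for r in vintages if r <= eval_date)
    else:
        raise ValueError("approach (4) needs nowcasts recorded in real time")
    data = vintages[release]
    if approach == 1:
        return dict(data)  # full sample, including periods after eval_date
    # Expanding sample: only observation dates up to eval_date.
    return {d: v for d, v in data.items() if d <= eval_date}


# Made-up vintages for illustration only.
vintages = {
    date(2020, 1, 31): {date(2020, 1, 1): 1.0},
    date(2020, 2, 28): {date(2020, 1, 1): 1.5, date(2020, 2, 1): 2.0},
    date(2020, 3, 31): {date(2020, 1, 1): 1.6, date(2020, 2, 1): 2.2,
                        date(2020, 3, 1): 3.0},
}
when = date(2020, 2, 28)
samples = {a: estimation_sample(vintages, when, a) for a in (1, 2, 3)}
```

In this toy example, approach (1) sees three observations, approach (2) sees two but with their final revised values, and approach (3) sees two with the values as first available on the evaluation date.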
The upshot is clear: Truly credible real-time evaluation, that is, evaluation using vintage information rather than just vintage data, can only be obtained by using nowcasts produced and permanently recorded in real time. ADS has been produced and recorded in real time since late 2008, so I can credibly study the key episode of current interest, the Pandemic Recession. I now proceed to do so.

Figure 2 reveals the jaw-dropping ADS drop in the Pandemic Recession, more than five times that of any other recession since 1960. The ADS drop is entirely appropriate, due to similarly jaw-dropping and historically unprecedented movements in its underlying indicators. 8 The official NBER trough month for the Pandemic Recession is April 2020, making the Pandemic Recession the shortest in history. 9

In Figure 3, I show the later-vintage Pandemic Recession path. The overall extracted path is smooth and convex, with a minimum in early April, and a return to positive growth by early May. 10 Note therefore that ADS would date the Pandemic Recession's end as May rather than the NBER's April. I stand by ADS, but the timing difference is of course negligible. The key difference is that ADS chronologies

7 I refer to an ADS extraction as a path.

8 See Appendix A for an annotated chronology of data releases and associated ADS movements.

9 Note that although the NBER peak and trough months are February and April, respectively, peaks are allocated to expansions and troughs are allocated to recessions. Hence the Pandemic Recession duration is two months (March-April 2020), as per Table 1. Note also that the April 2020 ending date was not determined and announced by the NBER until July 2021.

10 I emphasize again, however, that ADS, like the NBER recession chronology, tracks real activity growth, not level. Hence positive ADS does not necessarily mean "good times"; rather, it means "good growth", which may be from a very bad initial condition. That was the situation in May, as the battered U.S. economy resumed growth.

(1) In the top panel I show the 2/28/2020 path. ADS has not moved.

(2) In the second panel I show the 3/27/2020 path, which looks very different. ADS has become acutely aware of the disastrous situation; indeed most of the 3/27 path is well below the previous all-time (post-1960) ADS low during the 1970s oil-shock recession. 11

(3) In the third panel I show the 4/30/2020 path. The April initial claims news is bad, but less bad than March, which is good, and ADS shows a minimum in late March followed by a rise toward normalcy by the end of April.

(4) In the fourth panel I show the 5/29/2020 path. The May news is very bad, dominated by the shockingly bad May 8 payroll employment number (for April), and the late-May path is massively down-shifted relative to the late-April path. The new minimum is in mid-April rather than late March, and the 5/29 ADS value is thoroughly dismal, nowhere near normalcy.

(5) In the fifth panel I show the 6/26/2020 path. Thanks to the strong May payroll employment number (released June 5), ADS moved into normal territory, and stayed there. There is clear evidence for a Pandemic Recession trough in early/mid May, when ADS hits 0.

11 It is also apparent that the Kalman smoother may be smoothing "too much", producing low ADS values well before mid-March, going back into February and even January. Its smoothing is optimal relative to the patterns in historical data, but the March initial jobless claims movements were unprecedentedly sharp.

In Figure 5, I show the complete path plot during the Pandemic Recession through 6/26/2020, with the later-vintage path in red for comparison. The path plot is the set of all real-time paths; by following rightward through the sequence of paths, moving through time, I track the evolution of ADS beliefs about the chronology of business conditions. In Appendix A I provide a corresponding annotated path chronology.
There are wide real-time divergences between individual early paths and the later vintage red path. There are interesting patterns, however, with several real-time "meta paths" evident: (1) The first extends through the 3/19/2020 ADS announcement. ADS does not move. Initial claims rise from 0.2m to 0.3m, a large move by historical standards, confirming what everyone already knew: the pandemic would have important real economic consequences, but the Kalman smoother optimally but erroneously ascribes it to measurement error. (2) The second meta-path begins with the 3/26/2020 and 4/2/2020 initial claims explosions. ADS plunges, but then recovers steadily despite a steady stream of bad news (it is bad, but getting less bad), almost back to 0 by the 5/7/2020 initial claims announcement. (3) The third meta-path begins with the horrific 5/8/2020 April payroll employment release, with ADS again plunging. It then again begins mean reverting, and does so completely when the strong May payroll employment number is released on 6/5/2020. In Figure 6 I show the corresponding "dot plot", with the 6/26/2020 path again superimposed. Each dot is the last observation of its corresponding path in Figure 5 . The dots are real-time filtered values, because smoothed and filtered values coincide for the last observation in a sample. The dot plot is highly volatile and emphasizes the various meta-paths. It is interesting to compare the third ADS (May) real-time meta-path to the late-vintage ADS path, and to the eventually-released NBER chronology, all of which appear in Figure 6 . Real-time ADS was gloomy throughout May, even as later-vintage ADS indicates a recovery beginning in early/mid May, and the NBER chronology similarly indicates a recovery beginning by the end of April. In real time the return to growth was not obvious until the June 5 release of May payroll employment. 
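The relation between the path plot and the dot plot can be sketched in a few lines of Python. The layout is an assumption made for illustration: each real-time vintage is stored under its release date as a date-ordered list of (date, value) pairs. The function name `dot_plot` is hypothetical, and the numbers are made up rather than actual ADS values.

```python
from datetime import date


def dot_plot(paths):
    """Return one (vintage_date, last_date, last_value) dot per real-time path.

    paths maps vintage (release) date -> date-ordered list of (date, value).
    Dots are returned in order of vintage date.
    """
    dots = []
    for vintage in sorted(paths):
        path = paths[vintage]
        if not path:  # skip empty vintages
            continue
        last_date, last_value = path[-1]  # final observation of this path
        dots.append((vintage, last_date, last_value))
    return dots


# Made-up numbers for illustration only; not actual ADS values.
paths = {
    date(2020, 3, 26): [(date(2020, 3, 20), -1.0), (date(2020, 3, 21), -2.0)],
    date(2020, 3, 19): [(date(2020, 3, 13), 0.1), (date(2020, 3, 14), 0.0)],
}
dots = dot_plot(paths)
```

Because each dot is the endpoint of its own path, no information dated after that endpoint enters it, which is why the dots can be read as real-time filtered values.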
Because the March-April 2020 collapse in economic activity was obviously caused by COVID-19, it is of interest to directly examine the correlation between the two. I can do so at high frequency (daily), because I have both daily ADS and COVID new cases / deaths data.

I want to correlate COVID new cases with ADS, but the direct new cases data are less reliable than deaths during the period of interest, because new cases were likely heavily influenced by changes in the amount of testing undertaken. Instead, a more reliable indicator of new cases is deaths, adjusted for the approximately 20-day period between infection and death. Hence I use deaths led by 20 days.

In Figure 7 I show ADS vs COVID deaths+20. The strength of the negative correlation is striking. Of course economic activity plunged in March when COVID exploded, but there's much more than that: ADS and COVID continue to move in lockstep (inversely) through the April COVID peak, its April-May decline, and its June rebound.

It is informative to compare the evolution and congealing of views during the Pandemic Recession to those during an earlier, more "standard", recession, like the Great Recession of 2007-2009. I can't examine real-time ADS when entering the Great Recession, because ADS did not start until December 2008, well after the Great Recession began. But I can examine it when exiting the Great Recession.

In Figure 8 I show five paths in black, from ADS inception through the end of the Great Recession, at quarterly intervals. For comparison I also show a later-vintage path in red. It is revealing to compare the real-time paths to the ex post path. One can think of the later-vintage path as "truth", or at least a good assessment of truth based on later-vintage data.

In the top panel of Figure 8 I show the first ADS path, 12/5/2008.
ADS shows a very deep recession, almost the deepest on record since 1960, bottoming out in 2008Q3, with movement toward recovery in late Q3 and early Q4, even if it had stalled a bit by early December.

As it turned out, however, the Great Recession subsequently featured a growth rate "double dip". The 12/5/2008 ADS path ends just after the first dip, which involved a sharp drop in September 2008 and an equally sharp rebound. 14 At the time it was easy to read the cards as saying that the recession was ending, and ADS was a bit too optimistic, moving upward toward recovery.

Now consider the remaining panels of Figure 8. In the second panel I show the next, and contrasting, 3/6/2009 ADS path. In the interim ADS has quickly learned the situation, the double dip in particular, and is very much on track, capturing the second dip in January 2009. ADS continues to climb steadily through the third and fourth panels (6/5/2009 and 9/3/2009, respectively), and by the time of the bottom panel (12/4/2009) it is clear that the Great Recession ended in June or July, with ADS basically fluctuating around 0 after that. (Recall that ADS=0 means average growth, not zero growth.)

All told, the five quarterly real-time ADS paths generally match the ex post path closely, and they correctly identify the recession's end, well before the end of 2009 and indeed roughly 1.5 years before the official NBER announcement in September 2010.

To emphasize ADS timeliness, I plot the later-vintage ADS in Figure 8 all the way through 2010, which allows inclusion of the NBER's end-of-recession announcement on 9/20/2010, long after the fact and not helpful for real-time decision making. 15 ADS fills the gap left by the late-arriving NBER chronology, and it also provides a numerical measure that allows one to track the recession's pattern, depth, overall severity, etc., in addition to duration.
For example, and as recorded in Table 1, ADS identifies the Great Recession as the worst since 1960, with longest duration and third-greatest depth, resulting in the greatest overall severity (duration times depth). The Pandemic Recession, in contrast, was the deepest on record by an order of magnitude, but it was also the shortest by far, attaining a rank of third in overall severity (behind the 2007-2009 Great Recession and the 1973-1975 Oil Shock Recession).

In Figure 9 I show the complete path plot. Of course there are errors positive and negative as the recession evolves, but overall ADS performs well, sending a reliable and valuable signal for navigating the path out of recession. I show the corresponding dot plot in Figure 10. The dots are real-time smoothed values. 17 The black-dot sequence of real-time smooths is naturally less variable than the later-vintage (December 2010) smooth shown in red, because the latter has more information on which to condition, and therefore captures more variation.

14 In particular, according to the Federal Reserve's G.17 Industrial Production (IP) release of October 16, 2008, September IP was severely affected by a highly-unusual and largely exogenous "triple shock" (Hurricanes Gustav and Ike, and a strike at a major aircraft manufacturer), which caused an annualized September IP drop of nearly fifty percent. A similar pattern exists for Manufacturing and Trade Sales (MTS). IP and MTS also rebounded unusually sharply in October; indeed IP appears to "overshoot", presumably in an attempt by manufacturers to make up for September's loss.

15 In fairness it must be noted that the NBER is not seeking to be helpful for real-time decision making; rather, it seeks to meticulously construct the U.S. business cycle chronology of record, quite reasonably using all relevant information, even very late-arriving information.

My approach was part methodological and part substantive.
On the substantive side, I explored how views formed using the ADS nowcast evolved when entering/exiting the U.S. Pandemic Recession, which arrived abruptly and then ended quickly. In particular, I tracked the evolution of real-time vintage beliefs and compared them to a later-vintage chronology. ADS real activity growth plunged wildly in March 2020 and swung in real time as its underlying components swung, but it returned to brisk growth by mid-May. I also documented a strong negative relationship between the real-time ADS Pandemic Recession path and the concurrent real-time COVID-19 path, and I compared aspects of the Pandemic Recession and Great Recession paths.

On the methodological side, I clarified the meaning of truly honest real-time nowcast/forecast evaluation and illustrated it using the ADS Business Conditions Index, which has now been in operation over a long span that includes emergence from the Great Recession, entry into the Pandemic Recession, and exit from the Pandemic Recession.

An interesting methodological direction for future work would be decomposition of ADS movements into shares coming from the underlying indicators. One approach would be to use the observational weights implicit in the Kalman filter, as in Koopman and Harvey (2003). Another approach, popular in the recent machine learning literature, would be based on Shapley (1953) values, as in Israeli (2007). For initial explorations and insights, see Liu (2021).

17 They are also filtered, because smoothed and filtered values coincide for the last observation in a sample.

A Annotated ADS Chronology, 3/17/2020-7/2/2020

[Selected annotations, associated with large real-time ADS moves, appear in italics.]

03/17/2020, Industrial Production (for February 2020), data 09:15, ADS 10:45
This is the day of the last ADS update before the March 19 initial claims release.
ADS continued its more-or-less random vibration around zero, sending the same signal that it had sent since the end of the Great Recession in 2009: the economy is growing normally. ADS=0.1.

03/19/2020, Initial Jobless Claims (for week ending 03/14/2020), data 08:30, ADS 10:00
IJC took a large move upward, suggesting that the pandemic would have important real economic consequences. The Kalman smoother optimally but erroneously ascribed this first-time IJC jump almost entirely to measurement error, and ADS basically did not move. ADS=-0.2.

03/26/2020, Initial Jobless Claims (for week ending 03/21/2020), data 08:30, ADS 10:00
IJC spiked in jaw-dropping off-the-chart fashion. Two huge IJC moves in a row are not optimally ascribed to measurement error by the Kalman smoother; rather, they are naturally ascribed to the underlying serially-correlated real activity factor, and ADS drops to approximately -15 in similarly (and literally) off-the-chart fashion. By way of comparison, the all-time ADS lows since 1960 were in the recessions of 1973-1975 and 2007-2009, in both cases between -4 and -5. Note that the ADS path now begins its drop earlier in the year, a result of the serial correlation in IJC interacting with the Kalman smoother. It is interesting to speculate as to whether real activity really was lower in February (say), due for example to the virus-induced January-February collapse of a major trading partner (China). ADS=-14.5.

/27/2020, Real Personal Income Less Transfers data 08:30, ADS 10:00
IJC doubles off-the-charts, and ADS similarly doubles (downward) to -31. The Kalman smoother now has ADS beginning its decline in early January, again presumably an artifact of the serial correlation in IJC interacting with the Kalman smoother. Or, again, perhaps it's real

/03/2020, Payroll Employment
ADS evidently views the PE drop as good news, because it's not such a big drop compared to the off-the-charts drops

Initial Jobless Claims (for week ending 04/04/2020), data 08:30, ADS 10:00
Another massive IJC increase, but ADS largely unchanged

45
IP plunges, but it's for the previous month, and ADS actually continues its gradual upward mean reversion as initial claims improve

Initial Jobless Claims (for week ending 04/11/2020), data 08:30, ADS 10:00
IJC drops some, and ADS improves

Initial Jobless Claims (for week ending 04/18/2020), data 08:30, ADS 10:00
IJC and ADS again improve

been as bad as late March, 2020Q1 GDP growth would have been massively worse, consistent with the massive late-March ADS drop. ADS is essentially unchanged

Initial Jobless Claims (for week ending 04/25/2020), data 08:30
Real Manufacturing & Trade Sales
ADS 10:00
PILT for the previous month down sharply. The day's news is all bad, yet not so bad as it was, and ADS improves

ADS 10:00
IJC again improving. The IJC numbers continue to be bad, but they are getting less bad, and ADS seems driven by that. By this time the path plot makes clear that new data are causing sizable revisions in entire paths. For example, the huge ADS trough was estimated to be approximately -32 in the 4/2 vintage, but it was progressively moved upward in subsequent vintages

Payroll Employment
Plunges downward, and ADS plunges similarly to an all-time low

/14/2020, Initial Jobless Claims (for week ending 05/09/2020) ADS 10:45
Plunges but ADS nevertheless improves. ADS=-26.9

/21/2020, Initial Jobless Claims (for week ending 05/16/2020)

/28/2020, Initial Jobless Claims (for week ending 05/23/2020)
Real GDP (first quarter 2020, second release), data 08:30
Real Manufacturing & Trade Sales
Real Personal Income Less Transfers

Initial Jobless Claims (for week ending 05/30/2020), data 08:30, ADS 10:00
IJC continues its ever-so-slow reversion to normalcy

/05/2020, Payroll Employment

/11/2020, Initial Jobless Claims

/16/2020, Industrial Production

/18/2020, Initial Jobless Claims (for week ending 06/13/2020)

/25/2020, Initial Jobless Claims

/26/2020, Real Manufacturing & Trade Sales
April MTS was weak as expected

/26/2020, Real Personal Income Less Transfers
May PILT was strong, returning to positive growth

Initial Jobless Claims (for week ending 06/27/2020), data 08:30

Real-Time Macroeconomic Monitoring: Real Activity, Inflation, and Interactions
Real-Time Measurement of Business Conditions
Forecasting Economic Time Series Using Targeted Predictors
Nowcasting
Are More Data Always Better for Factor Analysis?
Measuring Business Cycles
Real-Time Real Economic Activity Entering the Pandemic Recession
Forecasting Output with the Composite Leading Index: A Real-Time Analysis
An Interactive Web-Based Dashboard to Track COVID-19 in Real Time
A Quasi-Maximum Likelihood Approach for Large, Approximate Dynamic Factor Models
Time Series Analysis by State Space Methods
Forecasting, Structural Time Series Models and the Kalman Filter
A Shapley-Based Decomposition of the R-Square of a Linear Regression
Computing Observation Weights for Signal Extraction and Filtering
Understanding Real Economic Activity During COVID-19 Using a Dynamic Factor Model Decomposition
FRED-MD: A Monthly Database for Macroeconomic Research
Business Cycle Modeling Without Pretending to Have too Much a Priori Theory
A Value for N-Person Games
New Indexes of Coincident and Leading Economic Indicators
Forecasting Using Principal Components from a Large Number of Predictors