Monday, December 30, 2024

Forecasting and Scenario Construction

 


Of course, we cannot know The Future. Think about trying to predict the Future in 1900: World War I, The Great Depression, World War II, the Cold War, etc. Even after the fact, we have trouble explaining what happened in the InterWar Period. Yet, we persist. We try to predict the effects of Climate Change. We try to predict the path of Hurricanes. We try to predict next Quarter's Economic performance.

Two visions inspire all this effort: (1) in Isaac Asimov's Foundation Trilogy, the Science Fiction Psycho-Historian Hari Seldon created a hand-held device called the Prime Radiant that predicts the collapse of the Galactic Empire, and (2) Limits to Growth Engineer Jay Wright Forrester created a computer program, World1, that predicted the collapse of the World System in 2050 as a result of resource shortage. These aren't the work of cranks. Isaac Asimov was a scientist. Jay Wright Forrester taught at MIT.

It is a mistake to think anyone knows the future. It is unknowable. What I think we can do is Explore the Future with state space models, systems analysis, multi-model inference and scenario construction. Some scenarios constructed in this manner are too awful to contemplate and must be avoided at all costs. An entire group of middle-range scenarios will seem likely but the best model cannot be chosen in advance. Systems are too complex and there is significant random error. An example of my approach is Five Futures for Russia in which I use five models of the Russian SocioEconomic system to construct statistical scenarios for the future (one is a statistical surprise). You can actually run the Business as Usual Model (BAU) on line here.

Computer simulation of statistically estimated systems models is essential. We have to get beyond the stage of arm-chair speculation which still seems to be the privileged mode of academic discourse based on The Classics. When faced with having to make predictions about the future of Climate Change, the science-based IPCC made the right choice: simulation and scenario construction. The Social Sciences have supplied few useful models for the IPCC project. Neoclassical Economics has provided the DICE Integrated Assessment Model but it is based on flawed neoclassical assumptions and is not statistically estimated or tested.

Here is some more detail on my approach. The methodology is all readily available and it remains for the Social Sciences to make a serious effort to apply it.

Atlanta Fed Economy Now

My approach to forecasting is similar to the EconomyNow model used by the Atlanta Federal Reserve. Since the new Republican Administration is signaling that they would like to eliminate the Federal Reserve, the app might well not be available in the future.

One important comment about the Atlanta Fed GDPNow app: the underlying forecasting model is based on the work of Stock and Watson (2012) on Diffusion Indexes. In Systems Theory, the diffusion index should be interpreted as a set of approximate state variables (essential variables) for a State Space model. The actual state of the system is then computed using the Kalman Filter and estimated using the dse package (see below).
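To make the Kalman Filter step concrete, here is a minimal sketch for a one-variable State Space model. It is not the GDPNow model and not the dse package: the function name `kalman_filter` and the parameters `a`, `c`, `q`, `r`, `x0` and `p0` are my own illustrative assumptions, and a real application would have a vector of state variables.

```python
def kalman_filter(observations, a=1.0, c=1.0, q=0.1, r=1.0, x0=0.0, p0=1.0):
    """Filter a scalar state space model (illustrative assumption, not dse):

        state:        x[t] = a * x[t-1] + w,   w ~ N(0, q)
        measurement:  y[t] = c * x[t]   + v,   v ~ N(0, r)

    Returns the list of filtered state estimates, one per observation.
    """
    x, p = x0, p0          # current state estimate and its variance
    estimates = []
    for y in observations:
        # Predict: push the previous estimate through the state equation.
        x_pred = a * x
        p_pred = a * p * a + q
        # Update: weight the prediction error by the Kalman gain.
        gain = p_pred * c / (c * p_pred * c + r)
        x = x_pred + gain * (y - c * x_pred)
        p = (1.0 - gain * c) * p_pred
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    noisy_series = [4.8, 5.3, 5.1, 4.7, 5.2, 5.0]  # made-up numbers
    print(kalman_filter(noisy_series))
```

With the state noise `q` set to zero and a very large starting variance `p0`, the filter reduces to a running mean of the observations, which is a handy sanity check.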

Hurricane Forecasting

My vision for SocioEconomic system forecasting is to follow the US National Oceanic and Atmospheric Administration's (NOAA) approach to hurricane (Economic Crisis?) forecasting using Spaghetti Models.


Currently, Economic forecasting does not use Multimodel Inference but it is getting there! Model selection is based on the Akaike Information Criterion (AIC).
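As a rough sketch of how AIC-based Multimodel Inference works: each candidate model gets an AIC score, the scores are converted to Akaike weights, and the weights can be used to average the competing forecasts. The function names and the example numbers below are hypothetical, chosen only for illustration.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Convert AIC scores into weights that sum to 1.

    Each weight is proportional to exp(-0.5 * delta), where delta is the
    model's AIC minus the smallest AIC in the set.
    """
    best = min(aic_values)
    relative = [math.exp(-0.5 * (value - best)) for value in aic_values]
    total = sum(relative)
    return [value / total for value in relative]


def model_average(forecasts, weights):
    """Weighted average of one forecast per model."""
    return sum(f * w for f, w in zip(forecasts, weights))


if __name__ == "__main__":
    # Hypothetical log-likelihoods and parameter counts for three models.
    scores = [aic(-120.0, 3), aic(-118.5, 5), aic(-119.0, 4)]
    weights = akaike_weights(scores)
    print(weights, model_average([2.1, 1.4, 1.8], weights))
```

A model whose AIC is two points worse than the best one still receives a weight of roughly 0.27 in a two-model comparison, which is why the middle-range scenarios cannot simply be discarded.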

Climate Change

My approach takes the IPCC Emission Scenarios and generalizes them to include many other variables, not just CO2 emissions and Global temperature. These scenarios are for the World System. Needless to say, the new Right-Wing Republican administration plans on withdrawing the US from all attempts to study or ameliorate Climate Change.


Compare the graphic above to my Alternate Futures for the US.

Data Sources and Estimation

There is a wealth of historical data sources that remain to be exploited for statistical analysis. My primary source for the Late Twentieth Century is the World Development Indicators, which contains a treasure trove of historical data on every country in the modern World-System. For longer-term historical analysis I use the Historical Statistics for the World Economy and the Maddison Historical Statistics Project. For detailed models of individual economies I use each country's historical statistics (for example, the Historical Statistics of the US or the European Historical Statistics). To deal with missing data I use the EM (Expectation-Maximization) Algorithm and Nonlinear Spline Smoothing. For model estimation I use the dse package in the R programming language. You can run R-code on line using the Snippets web-based service. My main purpose in using historical data is to develop models, not to understand the true course of World History.

Thursday, December 10, 2020

Confounded COVID Vaccine Trial


Figure 1. COVID incidence in RCT (source: VRBPAC briefing document)

December 10, 2020. Today, the FDA Vaccines and Related Biological Products Advisory Committee (VRBPAC) met to discuss the request for emergency use authorization (EUA) of a COVID-19 messenger RNA vaccine from Pfizer, Inc. My guess is that the graph above will be critical to approval of the vaccine. One week after injection, the experimental and control (placebo) groups diverged. The control group (red squares) showed continued COVID infections while the experimental group, which received the vaccine, showed very few new cases. Even though the graph appears to be powerful evidence of the vaccine's effectiveness, I have some questions about the study's methodology.

Figure 2. Questions for the FDA

I was unable to find answers to my questions in the documentation submitted to VRBPAC, so I submitted my questions to the FDA (graphic above). My general concern is about confounding, that is, other explanations that could account for the results of the Randomized Controlled Trial (RCT).

Figure 3. US COVID Daily Counts (source Healthdata.org)

My concerns are based mainly on the graph above, which plots actual and projected Daily COVID case counts under four conditions: (1) Mandates for masks and social distancing easing (red), (2) Universal Masks (green), (3) Rapid Vaccine Rollout (blue), and (4) the Current Projection (purple). What catches my attention in the graph is that Rapid Vaccine Rollout is not much better than the Current Projection after four months (the farthest into the future the projections are being made). The most effective way to control daily COVID case counts is universal masking (green line vs red line).

My questions to the FDA are focused on the behavior of all subjects after vaccination. Assume that Figure 3, rather than Figure 1, was the result of an imaginary RCT. The experimental group (Rapid Vaccine Rollout) is now the blue line and there are three control groups: Current Projection (purple), Mandate Easing (red) and Universal Masks (green). In this imaginary experiment, the vaccine would still be more effective than the control groups, but not by much compared to Current Projection and Universal Masks. Look back at the Y-axis of Figure 1. The difference between experimental and control at Day 112 is about 0.02 (cumulative incidence), which looks a lot like the difference between Rapid Vaccine Rollout and Universal Masks in Figure 3.

In an RCT, it is typically not possible to control what the subjects do after vaccination. John Yang, a PBS NewsHour reporter who was a subject in the RCT, reports that he was told to go home and continue his routine, which involved staying home, social distancing and mask wearing. He knew very quickly that he was in the experimental group because he developed rather strong symptoms after vaccination. If his was a common experience, then blinding in the RCT was broken, which may have influenced the behavior of the experimental subjects. Subjects maintain a diary where they list side-effects and events. What is not clear is whether subjects recorded social distancing and masking.

There are a few other issues I have with the statistical analysis of the data: (1) Trials were conducted in multiple countries (Argentina, Brazil, South Africa and the United States) and multiple sites. The design is called a Multicenter Clinical Trial (MCT). The appropriate statistical model is a Hierarchical Linear Model (HLM) that allows for the analysis and control of differences across sites and countries. For example, mask use differs across countries: Argentina (90%), Brazil (60%), South Africa (70%) and the United States (70%) (source: Healthdata.org). The HLM controls for these and other differences across centers. For example, imagine that all the COVID cases in the control group were from Brazil. Whether or not an HLM was used in the analysis of the Pfizer study is not made clear in the documentation. (2) Testing of the assumptions underlying the statistical model is not reported. The dependent measure is a risk ratio comparing experimental and control groups. Ratios are known not to be normally distributed and must be transformed prior to analysis. The effect of the transformation in normalizing the data should be tested.
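To illustrate the point about transforming ratios, here is a minimal sketch that computes a risk ratio and a confidence interval on the log scale, then transforms back. I am assuming the log transformation, which is the usual choice for ratios; the documentation does not say which transformation, if any, was used. The function name and the case counts are hypothetical and are not the Pfizer trial data.

```python
import math


def risk_ratio_ci(cases_treated, n_treated, cases_control, n_control, z=1.96):
    """Risk ratio with a Wald confidence interval built on the log scale.

    The log of a risk ratio is closer to normally distributed than the
    ratio itself, so the interval is computed for log(RR) and then
    exponentiated. z=1.96 gives an approximate 95% interval.
    """
    risk_treated = cases_treated / n_treated
    risk_control = cases_control / n_control
    rr = risk_treated / risk_control
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(
        1.0 / cases_treated - 1.0 / n_treated
        + 1.0 / cases_control - 1.0 / n_control
    )
    log_rr = math.log(rr)
    lower = math.exp(log_rr - z * se_log_rr)
    upper = math.exp(log_rr + z * se_log_rr)
    return rr, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 10 cases among 1000 vaccinated subjects,
    # 50 cases among 1000 placebo subjects.
    rr, lower, upper = risk_ratio_ci(10, 1000, 50, 1000)
    print(f"RR={rr:.2f}, 95% CI=({lower:.2f}, {upper:.2f}), efficacy={1 - rr:.0%}")
```

Note that the resulting interval is not symmetric around the risk ratio, which is exactly the non-normality at issue. This sketch also pools all sites together, so it does nothing about the multicenter problem raised in point (1).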

If I learn anything more from listening to the ongoing FDA VRBPAC hearing, I will report it in the comments.