
Forecasting is Hard – Especially for the Future

February 12, 2020

Predicting behavior in the future is no easy task. Yet, we energy forecasters do this every day. The complexity of the problem should not be understated. Behind all of the software and the complicated models, we are attempting to predict how humans will act today, tomorrow and even 20 years from now.

In the context of short-term operational forecasting, the key unknowns are weather and solar conditions. Here, we depend upon third-party weather vendors to accurately predict various weather concepts, including dry bulb temperature, dewpoint temperature, cloud cover and solar irradiance.

When we build statistical models (e.g., regressions or neural networks), we are estimating the relationship between load and a variety of other variables. In so doing, we are implicitly assuming that the relationship the model captured in the past (via the coefficients on the variables) will persist into the future. The variables include many factors that we know with certainty: day-of-week, month and holidays. We know without question whether tomorrow is Thursday or Friday, or if tomorrow is Martin Luther King Jr. Day or New Year’s Day. We know with much less certainty whether tomorrow will be hot or cold, or if the sun will be obscured by clouds. Further, the accuracy with which the weather vendors can predict these concepts degrades as we go further into the forecast horizon. It is easier to predict tomorrow’s temperature than it is to predict the temperature a week into the future. None of this is especially surprising or revelatory, but these ideas are often assumed, rather than explicated.
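To make the split between known and uncertain inputs concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the coefficient values, the function names and the assumed temperature errors by horizon are chosen only for illustration and are not estimated from any real data. It assumes a simple linear model, in which a miss in the temperature forecast passes straight through to the load forecast.

```python
# Hypothetical coefficients from an already-estimated linear load model (MW).
# These numbers are illustrative assumptions, not estimates from real data.
COEFFS = {
    "intercept": 900.0,
    "temp_f": 12.0,     # MW per degree Fahrenheit
    "weekend": -80.0,   # MW shift on Saturdays and Sundays
    "holiday": -60.0,   # MW shift on holidays
}


def predict_load(temp_f: float, is_weekend: bool, is_holiday: bool) -> float:
    """Apply the past relationship (the coefficients) to future inputs."""
    return (
        COEFFS["intercept"]
        + COEFFS["temp_f"] * temp_f
        + COEFFS["weekend"] * is_weekend
        + COEFFS["holiday"] * is_holiday
    )


def load_error_from_temp_error(temp_error_f: float) -> float:
    """In a linear model, a temperature miss maps directly into a load miss."""
    return COEFFS["temp_f"] * temp_error_f


# Calendar inputs are known exactly; the temperature is a vendor forecast.
day_ahead = predict_load(temp_f=85.0, is_weekend=False, is_holiday=False)

# Assumed temperature forecast errors by horizon (days ahead -> degrees F).
assumed_temp_error = {1: 2.0, 3: 3.5, 7: 6.0}
load_error_by_horizon = {
    days: load_error_from_temp_error(err)
    for days, err in assumed_temp_error.items()
}
print(day_ahead, load_error_by_horizon)
```

The calendar terms contribute no forecast error of their own in this sketch. All of the growth in load error with horizon comes from the weather input, scaled by a coefficient that we are assuming still holds.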

In the context of medium- and long-term models, which extend 1 to 50 years into the future, we must depend even further upon others, including economic vendors (to provide household and GDP forecasts), the U.S. Energy Information Administration, or EIA (to provide saturation and efficiency trends of major end uses), and utility staff (to provide estimates of demand-side management, or DSM). That is a lot of moving parts, all of them outside of our control unless, of course, we feel empowered and motivated to change those values ourselves.

Let’s think briefly about economic forecasts. The U.S. Bureau of Economic Analysis (BEA) produces the historical estimates of Gross Domestic Product (GDP). It is well-known that U.S. GDP values are frequently and substantially re-stated. For example, Q3 GDP growth could be revised from 1.9% annual growth to 2.1%. That is a revision of 0.2 percentage points, or a 10.5% relative change in the growth rate, calculated as (2.1/1.9) – 1! Without putting too fine a point on it: a 10.5% adjustment is a lot. There are three important points here:

This is merely the historical value. We are not even looking at the forecast yet.
We are focusing on U.S. GDP. If this national value is updated so substantially, how confident can we be in the Gross State Product (GSP) or Gross Metro Product (GMP), which are at dramatically lower levels of aggregation?
The official numbers are at a quarterly frequency. In many cases, we utilize monthly values, which have been interpolated (via some mathematical approach) from the quarterly values.
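The revision arithmetic and the quarterly-to-monthly step described above can be sketched in a few lines of Python. This is only one possible approach: the function names are hypothetical, and the interpolation rule (each quarterly value placed at the middle month of its quarter, straight lines in between, flat at both ends) is an assumption made for illustration, not a description of what any particular vendor does.

```python
def relative_revision(original: float, revised: float) -> float:
    """Relative change in a reported value, e.g. 1.9 -> 2.1 is about +10.5%."""
    return revised / original - 1.0


def quarterly_to_monthly(quarterly: list[float]) -> list[float]:
    """Linearly interpolate quarterly values to a monthly series.

    Assumption: each quarterly value sits at the middle month of its quarter,
    the months in between are filled on a straight line, and the months before
    the first midpoint and after the last midpoint are held flat.
    """
    n = len(quarterly)
    monthly = []
    for month in range(3 * n):
        pos = (month - 1) / 3              # position measured in quarters
        pos = min(max(pos, 0.0), n - 1)    # hold flat at both ends
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        weight = pos - lo
        monthly.append(quarterly[lo] * (1 - weight) + quarterly[hi] * weight)
    return monthly


# The revision from the text: growth restated from 1.9% to 2.1%.
print(f"{relative_revision(1.9, 2.1):.1%}")

# Two quarters of a hypothetical index level, spread across six months.
print(quarterly_to_monthly([100.0, 103.0]))
```

Note that four of the six monthly values in the second example are produced by the interpolation rule rather than reported by anyone, and a revision to either quarterly value moves every month around it.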

To be fair, these numbers are based on surveys, statistical methods and various data collected by the government. The GDP numbers do not fall out of the sky and magically appear on the desks of government employees. Indeed, humans (with the help of computers) report and calculate these statistics. There is much space here for error. It is not as if there is some kind of metering device that collects all the data on every transaction in our economy and transmits it to the federal government for quick and easy reporting – that is not how any of this works.

Now, let’s think about forecasting this constantly moving target. If we have little confidence in the historical GDP value from last quarter or even the prior quarter, how confident can we be in the forecast for the coming year? By way of analogy, let’s imagine we are generating load forecasts every five minutes, based on updated weather conditions and the most recently reported historical five-minute load data. If the most recent observations were bouncing around by roughly 10%, it would come as exactly no surprise to anybody if our near-term load forecast were (how shall I put this?) bad.

We can make similar arguments about the saturation and efficiency drivers. The long-term weather forecast has its own set of issues: it is typically based on some measure of ‘normal’ weather, which can be calculated in various ways. Maybe we use a 10-year normal or a 20-year normal or a trended-normal. Again, there is much room for interpretation and for error.
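To show how much the choice of ‘normal’ can matter, here is a rough Python sketch that computes three versions from the same history. The function names and the input series are hypothetical: the series is synthetic (an annual weather measure rising by a constant 10 units per year), and the trended normal here is assumed to be a least-squares straight line extended one year ahead, which is only one of several ways to define it.

```python
from statistics import mean


def n_year_normal(annual_values: list[float], years: int) -> float:
    """Average of the most recent `years` annual observations."""
    return mean(annual_values[-years:])


def trended_normal(annual_values: list[float], steps_ahead: int = 1) -> float:
    """Fit a least-squares line through the history and extend it forward."""
    n = len(annual_values)
    x_bar = (n - 1) / 2
    y_bar = mean(annual_values)
    slope = sum(
        (x - x_bar) * (y - y_bar) for x, y in enumerate(annual_values)
    ) / sum((x - x_bar) ** 2 for x in range(n))
    return y_bar + slope * (n - 1 + steps_ahead - x_bar)


# Synthetic 20-year history: 1000, 1010, ..., 1190 (not real weather data).
history = [1000.0 + 10.0 * year for year in range(20)]

normals = {
    "10-year": n_year_normal(history, 10),
    "20-year": n_year_normal(history, 20),
    "trended": trended_normal(history),
}
print(normals)
```

With a steadily trending series like this one, the three definitions give three different answers from identical data, and each of them would feed a different long-term forecast.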

The point of this is not to denigrate the economic vendors or the EIA, but rather to bring a few issues into the daylight for evaluation. The fact that there are GDP numbers and saturation/efficiency drivers at all is a big accomplishment. We also do not let a lack of data or a lack of confidence in the data stop us from generating forecasts. We must do the best we can with the tools we have available to us!