An Empirical Model For Oil Prices And Some Implications

by: Ron Patterson

By Ian Schindler

The views expressed do not necessarily reflect the views of Ron Patterson or Dennis Coyne.


This work is preliminary. It is a preview of part of a paper I am writing with Aude Illig. There are three main reasons I am making this post. The first is as a public service. There are many people reading this blog who are directly affected by oil prices and who have to make decisions based on future oil prices. Having a model to understand the dynamics of oil prices is of use to them. The second reason is that some people reading this blog model oil extraction. These models either omit price considerations or make assumptions on them. Our model is a large improvement on these assumptions so it should improve their extraction models. The final reason is that I consider the quality of the comments on this blog to be high. I believe that the feedback I get from this post will improve the quality of the final paper. Indeed, Dennis Coyne has already provided valuable feedback after previewing the post. This study has been a humbling experience. Get ready to throw out everything you thought you knew about oil prices.

The model does not by any means explain all oil price variation. What is remarkable is that with only one data set, it explains so much. Many factors may affect the price of oil. This model provides a base to which other variables can be added to find what explains oil prices.

Last year I was asked to write a chapter titled "Strategies for an Economy Facing Energy Constraints" for a book, and I wrote it with my daughter. I do not think the book will be published, but the chapter may be of interest to some. I have posted the PDF file online and will refer to it often [2].

I prefer the terminology of Turchin and Nefedov to the term "peak oil". Because oil is a finite resource, it will have a growth phase, a stagflation phase, and a decline or contraction phase. Turchin and Nefedov characterized these phases in agrarian civilizations [3]. The phases of oil extraction have similar characteristics. I believe that the growth phase of oil ended around 2005 and that the stagflation phase ended towards the end of 2014. The sign that the stagflation phase ended was the drop in oil prices.

The Cost Share Theorem from neoclassical equilibrium theory says that oil and food are not very important in economic production because their cost shares are small. Our price model is consistent with the opposite view: that oil extraction has been extremely important in economic production. In [2], computations show that the dynamics of the cost share is the indicator of importance in an economic production function. In particular, the cost shares of important factors have negative derivatives; that is, they shrink during periods of economic growth and grow during periods of economic contraction.

An Empirical Model

George Box said that all models are wrong, but some models are useful. Our goal is to use historical oil extraction data to explain oil prices. If one can explain a large part of oil price variation with this data, the model can be used to understand what made oil prices move in the past. One can then make predictions under the assumption that, at least in the short term, past conditions have not changed too much. If the predictions do not match future prices, this is also information. It means that some past condition no longer holds. The model can help to determine exactly which condition no longer holds.

We used price and extraction data for crude, condensate, and NGL from BP's (NYSE:BP) 2015 Statistical Review because the extraction data goes back to 1969 (actually to 1965, but we were also looking at cost share and our data set for GWP only went back to 1969). We wanted to include the price shocks of the 1970s.

By p(t) and q(t) we denote the price of oil and the quantity of oil extracted (in barrels) in year t. The data immediately give us a model for the price of oil: the average price for the period. Clearly, this is not a good model because there are large variations from this average value. What could cause these variations? From Figure 1, one sees that quantity alone cannot explain price because p is not uniquely determined by q; in other words, several prices correspond to the same quantity produced. So we attempt to use autocorrelation: we attempt to explain p(t) as a function of q(t), q(t-1), q(t-2), etc. It is well known that the more variables one uses, the less robust the model is. By robust, we mean that the model remains reliable for predicting future prices; too many variables impair that reliability. For example, if we used all the variables q(t-k), k=1, …, 44, we could explain all the variation from the mean, but the information would be useless for predictions because we would have been too greedy (overfitting the data). There are other factors that affect price, such as the weather, strikes, earthquakes, interest rates, financial bubbles, etc., that are not included in the extraction data. Better predictions are made from fewer variables and a fit which is not exact.


Figure 1: Price vs. quantities

We define

Dq(t) = q(t) - q(t-1) (2.1)
DDq(t) = q(t)-2q(t-1)+q(t-2) (2.2)
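
As a minimal sketch of definitions (2.1) and (2.2), the two differences can be computed from any yearly series. The function names and the sample numbers below are illustrative assumptions, not data from the post.

```python
def first_difference(q):
    """Dq(t) = q(t) - q(t-1), returned for t = 1 .. len(q)-1."""
    return [q[t] - q[t - 1] for t in range(1, len(q))]


def second_difference(q):
    """DDq(t) = q(t) - 2q(t-1) + q(t-2), returned for t = 2 .. len(q)-1."""
    return [q[t] - 2 * q[t - 1] + q[t - 2] for t in range(2, len(q))]


# Made-up yearly extraction figures, for illustration only.
q = [76.0, 78.0, 80.0, 80.0, 78.0]

dq = first_difference(q)     # [2.0, 2.0, 0.0, -2.0]
ddq = second_difference(q)   # [0.0, -2.0, -2.0]

# The second difference is the first difference applied twice.
print(ddq == first_difference(dq))  # True
```

Each difference shortens the series by one year, which is why a model using DDq(t) can only be fitted from the third year of data onward.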

Note that Dq(t) and DDq(t) are the discrete first and second derivatives of q(t) with time step h=1. The vectors q(t), Dq(t), and DDq(t) span the linear space generated by q(t), q(t-1), and q(t-2), so our methodology is equivalent to using the latter variables. We prefer the former variables because the results are easier to interpret. I will discuss the following model:

log(log(p(t)))=a + bq(t) -cDq(t)+dDDq(t) (2.3)

where a, b, c, and d are positive constants determined by linear regression (a priori it was not known that they were positive; this was determined by the regression). After reviewing this post, Dennis Coyne generously shared his EIA data from 1960. We get a much better fit with the EIA data, which only includes crude and condensate production and no NGL, so we will use it for the paper.

Equation (2.3) is equivalent to

p(t)=exp(exp(a+bq(t)-cDq(t)+dDDq(t))) (2.4)

The best linear regression was obtained for log(log(p(t))) because the dependency of price on these variables is non-linear. Linear regression tests for affine functions. The log flattens large values. The log(log) flattens large values even more strongly, and it was a great surprise that this model gave the best fit with the data we used. This corresponds to the inelasticity of oil prices: small changes in supply provoke large changes in price.

The R output for the regression is as follows:

Call:
lm(formula = log(log(Price71)) ~ Quantity71 + DQuantity71 + DDQuantity)

Residuals:
     Min       1Q   Median       3Q      Max
-0.25737 -0.08838  0.01609  0.08395  0.26496

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  8.955e-01  1.299e-01   6.895 2.62e-08 ***
Quantity71   1.944e-05  5.098e-06   3.814 0.000464 ***
DQuantity71 -1.412e-04  3.667e-05  -3.852 0.000415 ***
DDQuantity   5.847e-05  2.821e-05   2.072 0.044717 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1269 on 40 degrees of freedom
Multiple R-squared:  0.4115, Adjusted R-squared:  0.3674
F-statistic: 9.324 on 3 and 40 DF,  p-value: 8.45e-05

Adjusted R-squared is 0.367, which means that the model explains 36.7% of the variance from the mean, taking into account the number of variables. In other words, a large part of what is normally called demand is determined by supply and its first and second derivatives. Frequently demand is estimated by economic growth. Oil price is strongly correlated to GWP. Thus, three years of oil extraction can give a good estimate of GWP. This is strong evidence that oil extraction has been a major determinant in economic production. This is much more reasonable than to imagine that GWP this year somehow determined oil extraction this year, last year, and the year before.

The intercept is the coefficient a, the coefficient of Quantity71 is b, that of DQuantity71 is -c, and that of DDQuantity is d.
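
The regression in this post was run in R. For readers who prefer Python, the following is a rough sketch of the same kind of fit using NumPy least squares. The extraction series is synthetic (random numbers, not the BP data), and the function names are assumptions made for illustration. The last line checks the relationship between the reported R-squared and adjusted R-squared for 44 observations and 3 predictors.

```python
import numpy as np


def design_matrix(q):
    """Columns: intercept, q(t), Dq(t), DDq(t), for t = 2 .. len(q)-1."""
    q = np.asarray(q, dtype=float)
    dq = q[2:] - q[1:-1]
    ddq = q[2:] - 2.0 * q[1:-1] + q[:-2]
    return np.column_stack([np.ones_like(dq), q[2:], dq, ddq])


def fit(q, p):
    """Ordinary least squares of log(log(p(t))) on q(t), Dq(t), DDq(t)."""
    X = design_matrix(q)
    y = np.log(np.log(np.asarray(p, dtype=float)[2:]))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, r2


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared as reported by R's summary of a linear model."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Synthetic (made-up) extraction series, NOT the BP data.
rng = np.random.default_rng(0)
q = 20000.0 + np.cumsum(rng.normal(200.0, 500.0, size=46))

# Generate prices from the published estimates (a, b, -c, d) plus noise.
true = np.array([0.8955, 1.944e-5, -1.412e-4, 5.847e-5])
y = design_matrix(q) @ true + rng.normal(0.0, 0.1, size=44)
p = np.concatenate([[50.0, 50.0], np.exp(np.exp(y))])  # first two prices are unused

coef, r2 = fit(q, p)
print(coef, r2, adjusted_r2(r2, 44, 3))

# Consistency check on the figures in the R output.
print(round(adjusted_r2(0.4115, 44, 3), 4))  # 0.3674
```

On real data one would replace the synthetic q and p with the yearly extraction and price series; the first two years are lost to the differences.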


If extraction is constant for two years, then Dq(t)=DDq(t)=0. In that case, the model reduces to:

p(t)=exp(exp(.8955+1.944*10^(-5)q(t))) (3.5)

We call this the basic price formula; it predicts the price if extraction is constant. Note that the basic price is an increasing function of quantity. This is another indication that oil extraction is a very important part of economic production, because it indicates that the price divided by the cost share is increasing, which in turn indicates super-linear scaling in GWP [2, Theorem A.1.1]. In Figure 2, we graph the basic price as a function of quantities. Because the exponential is such a fast-growing function, one must be careful extending the model to extraction levels much larger than current levels.

Figure 2: The Basic Price vs. Quantities

Suppose Dq(t) is a constant so DDq(t) =0. Then the model becomes:

p(t)=exp(exp(.8955+1.944*10^(-5)q(t) - 1.412*10^(-4)Dq(t))) (3.6)

Note that the coefficient of Dq(t) is about 7 times larger than the coefficient of q(t) and of opposite sign. Thus, Dq(t) gives a much larger price signal than q(t). The signal goes in the opposite direction of the change, but it only lasts for a year. One might understand this as follows: a rise (fall) in extraction influences the economy much more next year than this year. Imagine that an increase in extraction levels occurred on the first of the year. The economy is using a certain quantity of oil at the basic price, and suddenly on the first of the year an extra million barrels a day is delivered. In order to unload this extra oil, the price must drop. It takes a while to figure out what to do with the extra oil, but by the end of the year it is sorted out and someone is using the oil to get some work done. This produces economic growth, and thus, if the same quantity is produced the next year, people will be able and willing to pay a higher price to continue using the same amount of oil. Thus the price rises.

Example 1

If extraction is constant at 80 mbd, the basic price is p(t)=$75. If q(t)=80, q(t-1)=78, and q(t-2)=76, then Dq(t)=Dq(t-1)=2 and DDq(t)=0. Then p(t)=$49 because extraction is rising. If q(t)=80, q(t-1)=82, and q(t-2)=84, then Dq(t)=-2 and DDq(t)=0. Then p(t)=$120 because extraction is falling.
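
The numbers in Example 1 can be reproduced with a few lines of Python. This is a sketch under one assumption that is not stated in the post: q is taken as annual extraction in million barrels, that is, the daily rate in mbd multiplied by 365. With that unit the regression coefficients give back the quoted prices. The function name `price` is illustrative.

```python
import math

# Coefficients a, b, c, d from the regression output above (BP data).
A, B, C, D = 0.8955, 1.944e-5, 1.412e-4, 5.847e-5
DAYS = 365  # assumption: the model's q is in million barrels per year


def price(q_mbd, dq_mbd=0.0, ddq_mbd=0.0):
    """Model price (2.4) for extraction and its differences given in mbd."""
    q, dq, ddq = (x * DAYS for x in (q_mbd, dq_mbd, ddq_mbd))
    return math.exp(math.exp(A + B * q - C * dq + D * ddq))


print(round(price(80)))       # 75: constant extraction, the basic price
print(round(price(80, 2)))    # 49: extraction rising by 2 mbd per year
print(round(price(80, -2)))   # 120: extraction falling by 2 mbd per year
```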

Dependence on the second derivative is more subtle. The second derivative is negative at local maximums and positive at local minimums so that the second derivative will mollify the price change caused by the first derivative. This explains why peak extraction is frequently associated with low prices. A minimum in extraction will thus be associated with relatively high prices. Economically this factor can be interpreted as follows: it takes two years for the economic growth (contraction) produced by an increase (decrease) in extraction to take hold. The first year it is rather fragile and easily reduced (increased) by a drop (rise) in extraction.

Example 2

  1. If q(t)=80, q(t-1)=78, and q(t-2)=80, then extraction reaches a local minimum at q(t-1). We compute p(t)=$70 rather than $49 as in Example 1 with the same increase in extraction from 78 mbd.
  2. If q(t)=80, q(t-1)=82, and q(t-2)=80, then extraction reaches a local maximum at q(t-1). We compute p(t)=$81 rather than $120 as in Example 1 with the same decrease in extraction from 82 mbd.
  3. It is interesting to note that if q(t)=Aρ^t with ρ=(1+r) and the growth rate r in a reasonable range (0<r<.12), then p(t) is an increasing function of q(t). For example, if q(t)=80(1.02)^t, then p(3)=$58.3<$60.8=p(4), thus increasing production at a constant rate produces increasing prices. However, this price is lower than the basic price at the same extraction quantity which is $87.6.
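
The cases in Example 2 can be checked the same way from a short extraction series. This sketch again assumes that q is annual extraction in million barrels (mbd times 365), which is inferred from the quoted prices rather than stated in the post; the function name `prices` is illustrative. Years are counted from zero in the code, so the values near $58.3 and $60.8 appear at list positions t=2 and t=3.

```python
import math

# Coefficients a, b, c, d from the regression output above (BP data).
A, B, C, D = 0.8955, 1.944e-5, 1.412e-4, 5.847e-5
DAYS = 365  # assumption: the model's q is in million barrels per year


def prices(series_mbd):
    """Model price for every year t >= 2 of a series given in mbd."""
    q = [x * DAYS for x in series_mbd]
    out = []
    for t in range(2, len(q)):
        dq = q[t] - q[t - 1]
        ddq = q[t] - 2 * q[t - 1] + q[t - 2]
        out.append(math.exp(math.exp(A + B * q[t] - C * dq + D * ddq)))
    return out


print(round(prices([80, 78, 80])[0]))  # 70: recovery after a local minimum
print(round(prices([80, 82, 80])[0]))  # 81: fall after a local maximum

# Constant 2% growth from 80 mbd: prices rise from year to year ...
growth = [80 * 1.02 ** t for t in range(6)]
p = prices(growth)
print(all(x < y for x, y in zip(p, p[1:])))  # True

# ... but stay below the basic price at the same extraction level.
basic = prices([growth[-1]] * 3)[0]
print(p[-1] < basic)  # True
```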

In Figure 3, we plot the fitted model with the actual data with the model's prediction for 2015 (based on an increase of 2.2 mbd from the last data point of 92 mbd in 2014). Note that the model does not do well with extreme prices, either high as in the 1970s, 80s, and from 2005 to 2014 or low at the end of the 20th century. This is because the model is adjusting to average prices and so will in general be between the observed price and the mean. In Figure 4, we plot the fitted model vs. the actual price with the EIA data provided by Dennis Coyne.

Figure 3: Fitted model with BP data

Figure 4: Fitted model with EIA data from Dennis

For those interested in using the model produced from EIA data, a=0.779552, b=.009443, c=0.058792, and d=0.023649.


The growth phase of oil extraction was characterized by increasing prices well below the basic price of each level of extraction due to short-term price signals resisting growth. The high prices during the stagflation phase of extraction can be explained by lower growth in extraction. Thus, the short-term signals are weaker and prices are closer to the basic price. With production of 92 mbd, the basic price in 2014 was $110/barrel.

We outline 3 possible scenarios for the contraction stage of oil extraction. We assume that peak extraction is 95 mbd which gives a basic price of $120.

  1. The first scenario is a steady decline in oil extraction. If extraction falls at a constant rate of 1% per year from 95 mbd, then q(t)=95*.99^t, from which we compute q(2)=93 and q(3)=92. We obtain the predicted prices p(2)=$145 and p(3)=$140. Thus, the price is high but decreasing. As extraction continues to decline, higher priced extraction will eventually be priced out and closed. This will lead to a faster decline in extraction, lower prices, and thus a self-reinforcing feedback cycle. This corroborates a phenomenon described in [3]: stagflation occurs because the civilization has reached the carrying capacity of the land. Peasants leave the countryside for the cities as they can no longer make a decent living in the countryside even as food production stagnates.
  2. The end of the stagflation period is characterized by civil war among the elite class [3]. If one equates the oil extraction industry with the elite class, the 2014-2015 price war can be seen as the beginning of a civil war among the elite class. Well financed expensive production (such as fracking) can keep production high to eliminate lower price competition (conventional) in the mistaken belief that lower extraction rates mean higher prices. Eliminating this competition initially creates higher prices due to higher decline rates, but faster decline in extraction quantities leads to faster declines in the basic price which will lead to a faster overall decline in oil extraction.
  3. A Seneca cliff [1] can be imagined if, for example, a sudden drop in non-OPEC production coincides with war in the Persian Gulf. If extraction rates fall precipitously and remain low for two years, the ensuing drop in prices will decimate the extraction industry and a recovery will be highly unlikely.

Note that in all three scenarios, price considerations from the model speed the rate of decline in extraction.
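
The prices quoted in the first scenario can be checked with a short Python sketch. As an assumption inferred from the numbers rather than stated in the post, q is taken as annual extraction in million barrels (mbd times 365); the function name `decline_path` is illustrative.

```python
import math

# Coefficients a, b, c, d from the regression output above (BP data).
A, B, C, D = 0.8955, 1.944e-5, 1.412e-4, 5.847e-5
DAYS = 365  # assumption: the model's q is in million barrels per year


def decline_path(peak_mbd, rate, years):
    """Return (t, q in mbd, model price) for t = 2 .. years under constant decline."""
    q = [peak_mbd * (1.0 - rate) ** t for t in range(years + 1)]
    rows = []
    for t in range(2, years + 1):
        dq = (q[t] - q[t - 1]) * DAYS
        ddq = (q[t] - 2 * q[t - 1] + q[t - 2]) * DAYS
        x = A + B * q[t] * DAYS - C * dq + D * ddq
        rows.append((t, q[t], math.exp(math.exp(x))))
    return rows


for t, q_t, p_t in decline_path(95, 0.01, 3):
    print(t, round(q_t, 1), round(p_t))
# 2 93.1 145
# 3 92.2 140
```

Changing `rate` or `years` shows how quickly the short-term premium from falling extraction is eroded by the falling basic price.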


The model presented supports the thesis that a large part of economic production is generated by increased extraction. The big surprise is that prices increase with extraction levels. The most powerful price signals come from short-term effects lasting no more than 2 years. It is hoped that this model will help modelers, policy makers, and investors understand the dynamics of the contraction phase of oil extraction. Modelers can now estimate the price of oil from their extraction numbers. From the price, investment in oil extraction can be estimated, which will lead to more reliable estimates of future oil extraction. The model can also be used to assess climate change mitigation policies. If a policy results in prices significantly below predicted values, it can be considered successful. The model might also be useful in predicting financial bubbles. We believe that our methodology can be extended to extraction of other fossil fuels and sources of energy.


[1] Ugo Bardi. The Seneca effect: why decline is faster than growth. Blog, 2011.

[2] Ian Schindler and Julia Schindler. Strategies for an economy facing energy constraints, 2015.

[3] Peter Turchin and Sergey Nefedov. Secular Cycles. Princeton University Press, 2009.