*By Dennis Coyne*

The Railroad Commission of Texas (RRC) recently reported new data for oil and natural gas output through August 2016. Dean Fantazzini has kindly shared his corrected estimates based on the most recent RRC data. His statistical procedure adds up the revisions to the RRC data from April 2014 to July 2016 to measure how incomplete the data has been in the past, and then uses this to estimate the "missing barrels of oil and cubic feet of natural gas" that will be added to the current "incomplete data" over the following 24 months. In the past, the RRC data for a given month has been about 99% complete once that month is 24 months older than the most recently reported month. Dean estimates the "correction factors" that need to be added to the reported data to get a more reliable estimate of recent output levels.
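To make the idea of a correction factor concrete, here is a minimal sketch of one plausible reading of the approach. It is not Dean's actual procedure, which is not spelled out here. The function names, the toy numbers, and the use of a 2-month window in place of the real 24 months are all assumptions made purely for illustration.

```python
from statistics import mean

def correction_factors(vintages, max_lag):
    """Average past shortfall at each reporting lag (hypothetical helper).

    vintages maps a release's latest reported month (an integer index)
    to a dict of {month: output as reported in that release}.
    A month is treated as complete once it is max_lag months old.
    """
    latest = max(vintages)
    final = vintages[latest]
    factors = {}
    for lag in range(max_lag + 1):
        gaps = []
        for release, data in vintages.items():
            month = release - lag
            # Only use months old enough that the newest release is "final".
            if month in data and month + max_lag <= latest:
                gaps.append(final[month] - data[month])
        factors[lag] = mean(gaps) if gaps else 0.0
    return factors

def apply_correction(vintages, factors):
    """Add the lag-specific factor to each month of the newest release."""
    latest = max(vintages)
    return {month: value + factors.get(latest - month, 0.0)
            for month, value in vintages[latest].items()}

# Invented toy data (NOT RRC figures): three releases, months 1 to 5.
toy = {
    3: {1: 100, 2: 95, 3: 80},
    4: {1: 100, 2: 100, 3: 94, 4: 82},
    5: {1: 100, 2: 100, 3: 101, 4: 96, 5: 81},
}
factors = correction_factors(toy, max_lag=2)
corrected = apply_correction(toy, factors)
```

In this toy case the newest month is raised the most, because it has historically been the least complete when first reported, while months older than the window are left unchanged.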

The correction factors for the month of August looked very low compared to the historical correction factors, so I asked Dean to check for a statistical break in the correction factors. In the past, Dean's analysis has found no statistical trend in the correction factors, but I wondered if there was now a downward trend due to the digitization of reporting by the RRC.

I will quote Dean's findings below (from an e-mail):

> I checked the time series for each correcting factor - for crude oil only - using unit root tests with a breakpoint, and I found that the correcting factors for the latest 6 months are non-stationary (even at the 1% level), with a break in the constant which took place in February 2016. The previous months (older than 6 months) are instead stationary.
>
> The effect of the ongoing digitalization process seems to (finally) appear in the data. However, many more data will be needed to confirm the break in the data structure: for example, the break in the constant is significant only at the 5% probability level, but not at the 1% level.
>
> Given this evidence, reporting both the corrected data using all the vintage data and the corrected data using the last 3 months (to take the structural break into account) may be a wise thing.

I decided to show the correction based on the last 6 months rather than 3 months because that is where the break occurs, though the difference between 3 months and 6 months is not significant (the 3-month estimate is 12 kb/d lower on average each month). I also show the previous method of using all the data (January 2014 to August 2016 for oil and April 2014 to August 2016 for condensate); this is called "all vintage" in the chart that follows.

For July 2016, the 6-month estimate is 161 kb/d higher than the EIA estimate and the all vintage estimate is 235 kb/d higher than the EIA estimate.

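Data for TX C+C from July 2015 to July 2016 is shown below. The first column is the all vintage estimate, then the 6-month estimate, and then the EIA estimate, all in kb/d.

| Month | All vintage | 6-month | EIA |
|---|---|---|---|
| Jul 2015 | 3452 | 3451 | 3452 |
| Aug 2015 | 3429 | 3427 | 3413 |
| Sep 2015 | 3436 | 3429 | 3415 |
| Oct 2015 | 3431 | 3421 | 3404 |
| Nov 2015 | 3436 | 3424 | 3409 |
| Dec 2015 | 3398 | 3383 | 3348 |
| Jan 2016 | 3443 | 3424 | 3361 |
| Feb 2016 | 3424 | 3401 | 3315 |
| Mar 2016 | 3408 | 3382 | 3295 |
| Apr 2016 | 3396 | 3363 | 3245 |
| May 2016 | 3375 | 3333 | 3193 |
| Jun 2016 | 3380 | 3321 | 3172 |
| Jul 2016 | 3396 | 3322 | 3161 |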
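For readers who want to check the gaps against the EIA estimate for every month, not just July 2016, the short sketch below recomputes them from the figures above. The month labels assume the rows run in order from July 2015 to July 2016, and the variable names are my own.

```python
# Texas C+C in kb/d: (month, all vintage, 6-month, EIA), taken from the data above.
rows = [
    ("2015-07", 3452, 3451, 3452),
    ("2015-08", 3429, 3427, 3413),
    ("2015-09", 3436, 3429, 3415),
    ("2015-10", 3431, 3421, 3404),
    ("2015-11", 3436, 3424, 3409),
    ("2015-12", 3398, 3383, 3348),
    ("2016-01", 3443, 3424, 3361),
    ("2016-02", 3424, 3401, 3315),
    ("2016-03", 3408, 3382, 3295),
    ("2016-04", 3396, 3363, 3245),
    ("2016-05", 3375, 3333, 3193),
    ("2016-06", 3380, 3321, 3172),
    ("2016-07", 3396, 3322, 3161),
]

# Gap of each corrected estimate over the EIA estimate, in kb/d.
gaps = {
    month: (all_vintage - eia, six_month - eia)
    for month, all_vintage, six_month, eia in rows
}

for month, (all_vintage_gap, six_month_gap) in gaps.items():
    print(f"{month}: all vintage {all_vintage_gap:+d}, 6-month {six_month_gap:+d}")
```

The last row reproduces the July 2016 gaps quoted above (235 kb/d and 161 kb/d), and the output shows the gap widening steadily from late 2015 onward.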

Dean also provides data on how his estimates have changed over time. In the chart below, I show Dean's Texas C+C corrected estimates (using all vintage data) from June 2015 to August 2016, where the month is the final data point of the estimate. The most recent estimate is lower than the previous 3 months. In the past, the correction factors have bounced up and down by quite a bit, so this could change. The 6-month corrected estimate in particular will be more volatile.

The chart below shows how the correction factors have changed over time. Statistically, we see no trend in the correction factors from April 2014 to February 2016 (the correction factors are "stationary"). From February 2016 to August 2016, we see a downward trend significant at the 5% level.
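As a rough illustration of what "no trend" versus "a downward trend" means, the sketch below fits a straight line to a series and reports the t-statistic of the slope. This is a plain least-squares trend check that I have swapped in for simplicity. It is not the unit root test with a breakpoint that Dean used, and the two series are invented placeholders, not actual correction factors.

```python
from math import sqrt

def slope_t_stat(y):
    """Least-squares slope of y against 0, 1, 2, ... and its t-statistic."""
    n = len(y)
    x = range(n)
    x_bar = (n - 1) / 2
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares around the fitted line.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = sqrt(rss / (n - 2) / sxx)
    return slope, slope / std_err

# Invented placeholder series (NOT Dean's correction factors).
flat = [50, 52, 49, 51, 50, 48, 52, 50]   # bounces around a constant level
falling = [50, 46, 43, 38, 35, 31, 26]    # declines month after month

flat_slope, flat_t = slope_t_stat(flat)
fall_slope, fall_t = slope_t_stat(falling)
print(f"flat:    slope {flat_slope:.2f}, t {flat_t:.2f}")
print(f"falling: slope {fall_slope:.2f}, t {fall_t:.2f}")
```

A t-statistic whose absolute value is well above about 2 suggests a trend that is unlikely to be noise. With only six or seven monthly points, as in the period since February 2016, such a result should be read cautiously, which matches Dean's remark that more data is needed to confirm the break.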

The natural gas corrected estimate is compared with the EIA estimate below.