Seeking Alpha

Ivan Kitov's Instablog

Ivan Kitov
I am a Doctor of Physics and Mathematics, Lead Researcher at the Institute for the Geospheres' Dynamics, Russian Academy of Sciences. Founding member of the Society for the Study of Economic Inequality. Published three monographs in economics and finance: Deterministic mechanics of pricing...
My company:
Stock Market Science
My blog:
Economics as Classical Mechanics
My book:
Deterministic mechanics of pricing
• Personal Income Distribution In The US
We are going to revisit our model for personal income distribution, PID. It was first formalized in 2003 and used income distributions through 2001. We had to convert all reports published by the Census Bureau in PDF format between 1947 and 1993 into Excel tables. It took a month of manual work, including proofreading. These reports have not yet been converted into a digital format.

In 2006, we used new data (through 2005) and re-estimated the model. In 2010, we published a book on personal income distribution using data through 2008. It is a good time to refresh the model and evaluate its performance since 2001 with ten more years of data. All major results will be presented in this blog.

We start by presenting the original data. The distribution of personal incomes since 1994 is characterized by a higher resolution: income bins are only \$2500 wide. Our model assumes that the overall income distribution depends on the age pyramid and the level of real GDP per capita. However, the evolution of PID is slow, and over a twenty-year horizon one effectively sees a frozen PID. The frozen PID results in an almost constant Gini ratio over time, which is indeed what the Census Bureau reports.

We illustrate PID in a few figures below. Figure 1 presents all PIDs published since 1994 between \$0 and \$100,000 as they are. We have included all people without income in the bin between \$0 and \$2500. One can observe that the number of people in higher income bins increases with time, as does the number of people with incomes above \$100,000 shown in Figure 2. The portion of people with incomes above \$100,000 has been increasing by 0.3% per year since 1994. Figure 3 shows the number of people with income above \$100,000 as a function of work experience. The fastest growth is observed for the groups between 30 and 40 years of work experience, i.e. between 45 and 55 years of age.

Figure 4 depicts the population density functions, PDFs, for the years between 1994 and 2010. First, the estimates presented in Figure 1 were normalized to the total population for a given year. Then we reduced the income scale for individual years, i.e. from 1995 to 2010, by the total growth of real GDP. This normalizes the curves to the total income, i.e. we reduce all scales to that of 1994. Finally, we divide the portion of population in each bin by the bin width for individual years and obtain the population density functions. Figure 4 shows that the distribution of personal incomes has not been changing over time in relative terms, i.e. a given portion of population always has a given portion of total income. From the PIDs one can always build the relevant Lorenz curves and estimate Gini ratios. For higher incomes, the distribution has to be described by the Pareto distribution. Figure 5 shows that the PDFs at higher incomes do follow a common power law with an exponent of -3.9.
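The three normalization steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to produce Figure 4: the function name `density`, the parameter `growth_factor` (the cumulative growth since the base year, to be taken from whatever GDP series is chosen), and the sample numbers are all hypothetical.

```python
def density(counts, edges, growth_factor=1.0):
    """Return (midpoints, density) for one year's binned income counts.

    counts        -- number of people in each income bin
    edges         -- bin edges in current dollars, len(counts) + 1 values
    growth_factor -- cumulative growth since the base year (1.0 for the
                     base year itself); a hypothetical input
    """
    if len(edges) != len(counts) + 1:
        raise ValueError("edges must have one more element than counts")
    total = float(sum(counts))
    # Reduce the income scale back to the base year.
    scaled = [e / growth_factor for e in edges]
    mids, dens = [], []
    for i, n in enumerate(counts):
        width = scaled[i + 1] - scaled[i]
        mids.append(0.5 * (scaled[i] + scaled[i + 1]))
        # Share of the total population, divided by the bin width.
        dens.append(n / total / width)
    return mids, dens


# Synthetic check: twice as many people and incomes 25% higher
# give the same density curve once both normalizations are applied.
base_edges = [0, 2500, 5000, 7500, 10000]
m0, d0 = density([40, 30, 20, 10], base_edges)
m1, d1 = density([80, 60, 40, 20], [e * 1.25 for e in base_edges],
                 growth_factor=1.25)
```

In this synthetic case the two curves coincide, which is the behavior a "frozen" distribution would show after normalization.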

Our first assessment of the income data obtained after 2001 is that they follow the previously obtained relationships. We expect that our model for personal income distribution should perform well.

Figure 1. Personal income distributions from 1994 to 2010.

Figure 2. Portion of people with income above \$100,000. The portion increases by 0.3% per year.

Figure 3. The number of people with income above \$100,000 as a function of work experience. The fastest growth is observed for the groups between 30 and 40 years of work experience, i.e. between 45 and 55 years of age.

Figure 4. Population density function, i.e. the number of people in a given bin normalized to the total number of people and the width of income bin, as a function of income reduced by the overall GDP growth.

Figure 5. The Pareto distribution at higher incomes.

Disclosure: I have no positions in any stocks mentioned, and no plans to initiate any positions within the next 72 hours.

Tags: demographics
Feb 06 10:34 AM
• Krugman And Damned Lies About Income Inequality. No Politics
Paul Krugman and a larger group of commentators have been speculating on the increasing economic inequality in the US. They do not trust any data from the BLS (income measurements obtained during Current Population Surveys) and deny that income data from censuses can be used to characterize the Gini coefficient, since these data sets do not contain higher incomes. They claim that the most interesting processes have been evolving at very high incomes. In this post, I am going to justify the estimates of Gini reported by the BLS. My goal is to extend the distribution of personal incomes to as high a level as possible and to demonstrate that this distribution follows the Pareto distribution, i.e. is well described by a simple power law. This observation allows replacing (interpolating) actual measurements with a simple function when calculating the Lorenz curve and thus the Gini coefficient.
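For readers who want to see how a Gini coefficient follows from binned data, here is a minimal sketch. It builds the Lorenz curve from per-bin head counts and mean incomes and integrates it with the trapezoid rule. The function name `gini_from_bins` and the sample numbers are hypothetical; the mean income of an open-ended top bin is simply an input here, which is where a power-law extrapolation could be plugged in. The trapezoid rule ignores inequality inside each bin, so it gives a lower bound.

```python
def gini_from_bins(counts, mean_incomes):
    """Gini coefficient from binned data, bins sorted by ascending income.

    counts       -- number of people in each bin
    mean_incomes -- mean income of the people in each bin
    """
    total_people = float(sum(counts))
    total_income = float(sum(n * m for n, m in zip(counts, mean_incomes)))
    gini = 1.0
    cum_income = 0.0  # Lorenz curve value at the left edge of the bin
    for n, m in zip(counts, mean_incomes):
        prev = cum_income
        cum_income += n * m / total_income
        # Subtract twice the trapezoid area under this Lorenz segment.
        gini -= (n / total_people) * (prev + cum_income)
    return gini


equal = gini_from_bins([3, 2], [10.0, 10.0])    # everyone has the same income
skewed = gini_from_bins([1, 1], [0.0, 1.0])     # one of two people has it all
```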

Following this direction, we have recently reported that the personal income distribution, PID, in the USA does not change with time when normalized to the total population and total income. In other words, the relative distribution of personal income in the United States has not been changing since the start of income measurements in 1947. The accuracy of early measurements is not good enough, however, and we have to rely on the most recent results.

The US Census Bureau routinely reports income estimates obtained during the Annual Social and Economic Supplement of the Current Population Surveys. We begin with the higher income range as reported by the BLS and have retrieved the population distribution over mean income in the range from \$0 to \$250,000. These distributions are available only from 2000. The relevant measurements of the number of people in a given income range were carried out in \$2500 bins between \$0 and \$100,000 and \$50,000 bins between \$100,000 and \$250,000.

The personal income distributions, as reported by the BLS in current dollars, are affected by the change in population (working age population) and nominal GDP growth. Also, the width of income bins varies with income level. Therefore, one cannot directly compare PIDs obtained in different years. In order to suppress the influence of the width we have calculated the population density, i.e. the ratio of the number of people in a given bin to its width. Since personal income is measured in current dollars, we have to reduce all incomes by the total change of the GDP deflator from 2000 to a given year. Figure 1 shows the result of normalization for 2000, 2005, and 2010. In relative terms, the income distribution has not been changing since 2000. At higher incomes, all three curves are practically identical. This observation is validated by the estimates of the Gini coefficient provided by the Census Bureau. There is a high income cap of \$250,000 (all incomes above the cap are gathered in one group), which is used by Krugman and company to reject the BLS estimates.

Let's take a look at the data they used to prove the increasing inequality. The incomes measured by the IRS are usually cited. Without loss of generality, we have retrieved "Table 1.1 Selected Income and Tax Items, by Size and Accumulated Size of Adjusted Gross Income, Tax Year 2009". (Any other year between 1996 and 2009 would serve equally well.) This table lists individual incomes in various income bins from \$1 to \$10,000,000. There are also 8274 reports of income above \$10,000,000. We cannot use the latter incomes but definitely can plot the population density function for all incomes below \$10,000,000. Figure 2 depicts the whole PID and Figure 3 its high income portion. The higher incomes are well approximated by a power law with an exponent of -3.07. (The difference of ~1.0 from the exponent for the BLS PDF (-4.1) is completely explained by the normalization to the total personal income reported by the BLS. It means that both exponents are identical.) It is likely that the same power law is valid at incomes higher than \$10,000,000. Hence, there is no significant deviation (except measurement errors) from the Pareto distribution even at very high incomes, and our extrapolation of the BLS incomes along the power law is valid for the calculations of Gini coefficients.
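A power-law exponent of this kind can be estimated as the slope of a straight line in log-log coordinates. The sketch below uses ordinary least squares on the logarithms, which is one simple choice and not necessarily the procedure behind the figures in this post. The function name `fit_power_law` and the synthetic data points are hypothetical; no IRS numbers are reproduced here.

```python
import math


def fit_power_law(incomes, densities):
    """Fit density = c * income**k by least squares on the logarithms.

    Returns (k, c): the exponent (slope in log-log space) and the prefactor.
    """
    xs = [math.log(x) for x in incomes]
    ys = [math.log(y) for y in densities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept)


# Synthetic high-income tail that follows an exact power law.
incomes = [1e5, 2e5, 5e5, 1e6, 5e6]
densities = [2.0 * x ** -3.07 for x in incomes]
k, c = fit_power_law(incomes, densities)
```

On real binned data the points scatter around the line, so it is worth plotting the residuals before trusting the fitted exponent.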

Conclusion: there is no growth in income inequality. Krugman et al. definitely exaggerate. As a Russian physicist, I have no political or any other emotional prejudice regarding the income distribution in the USA. I just calculate it.

Figure 1. The population density function, PDF, as a function of mean income as normalized to the total personal income for a given year. At higher incomes, the curves are practically identical.

Figure 2. Population density function reported by the IRS.

Figure 3. Population density function reported by the IRS for high incomes. The Pareto distribution is obvious. Fluctuations are likely related to measurement error.

Disclosure: I have no positions in any stocks mentioned, and no plans to initiate any positions within the next 72 hours.

Tags: economy
Feb 06 10:33 AM
• Unemployment In Spain Will Be Increasing
Here we revisit the rate of unemployment, $u_t$, in Spain using its dependence on the change in labor force, $l_t = dLF/(LF\,dt)$. There is a new estimate of 22.8% for the unemployment rate in 2011. In May 2011, we quantitatively predicted that this rate should only be growing. It may reach 29% if the link between the rate of unemployment and the rate of labor force change is correct, as has been observed since 1980.

Previously, it was found that Spain is characterized by the same relationship between unemployment and labor force as other developed countries. For Spain, we used data provided by the OECD. Figure 1 depicts unemployment and the rate of change of the labor force between 1960 and 2011. In line with the OECD description of the breaks in the labor force series:

Series breaks: In 2005, changes in the questionnaire and the implementation of CATI system in the field work affected the estimates. The 2005 questionnaire produced an additional increase of employment (132 000) and a decrease of unemployment (78 000). From 2001, the new unemployment definition established by the European Commission in 2000 has been introduced. From 1994, persons employed in the "Guardia Civil" are not included in the armed forces. As an indication, this category represented 59 600 people in 1994. In 1976, the lower age limit for inclusion in the Labour Force Survey was raised from 14 to 16, at the same time other modifications to the survey were introduced.

there are two spikes in the dLF/LF series near 1976 and 2001, related to step revisions to the level. The spike around 1988 has no explanation in terms of the revisions to labor force, but is of the same amplitude. One cannot exclude the possibility that this spike is related to the process of joining the EU in 1986.

As expected, the same functional form of dependence is valid for Spain. The estimation method is based on a trial-and-error approach and seeks the best fit between annual curves. The final model is as follows:

$u_t = -7.0\,l_t + 0.31; \quad t > 1986$

Figure 2 depicts observed and predicted curves. Before 1986, the curves diverge and a different model likely holds. Because of high-amplitude oscillations in the original time series for the rate of labor force change, $l_t$, we have to smooth it by MA(3). For the period after 1986, $R^2 = 0.7$. Thus, the change in labor force has been driving the rate of unemployment in Spain. The negative coefficient implies that unemployment in Spain goes down when the labor force starts to increase.
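The calculation behind the predicted curve can be sketched as follows, using the coefficients of the model above. Several details are assumptions made only for illustration: the labor force change is computed as a simple year-over-year relative difference, MA(3) is taken as a trailing three-year average (the text does not say whether the window is trailing or centered), rates are expressed as fractions rather than percent, and the labor force levels are made-up numbers, not OECD data.

```python
def labor_force_growth(levels):
    """Year-over-year relative change of the labor force, dLF/LF."""
    return [(levels[i] - levels[i - 1]) / levels[i - 1]
            for i in range(1, len(levels))]


def moving_average_3(series):
    """Trailing three-point moving average (assumed form of MA(3))."""
    return [sum(series[i - 2:i + 1]) / 3.0 for i in range(2, len(series))]


def predict_unemployment(l_smoothed, slope=-7.0, intercept=0.31):
    """Apply u = slope * l + intercept to each smoothed growth rate."""
    return [slope * l + intercept for l in l_smoothed]


# Hypothetical labor force levels (thousands of people), one per year.
levels = [20000.0, 20400.0, 20900.0, 21300.0, 21500.0, 21450.0]
growth = labor_force_growth(levels)
smoothed = moving_average_3(growth)
predicted = predict_unemployment(smoothed)
```

With these coefficients, zero labor force growth maps to an unemployment rate of 0.31, and any positive growth lowers the predicted rate.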

As predicted by our model, the rate of unemployment increased in 2011. This is not the end of the sad story of unemployment in Spain. Figure 2 suggests that it will likely keep growing as the labor force decreases.

Figure 1. Unemployment rate, u, and the rate of labor force change, l, in Spain according to the definition introduced by the OECD.

Figure 2. Prediction of the unemployment rate from the rate of labor force change. Due to high variation in the estimates of labor force we have smoothed it with MA(3). For the observed and predicted curves, $R^2 = 0.7$ for the period between 1986 and 2011.

Disclosure: I have no positions in any stocks mentioned, and no plans to initiate any positions within the next 72 hours.

Tags: economy
Jan 30 10:45 AM
