Another Statscan error, a big one

August 16, 2006 Marc Lee

On the front page of today’s Globe and Mail, it was reported that Statistics Canada’s estimates of the Consumer Price Index had been miscalculated by a weeny one-tenth of a percentage point since 2001.

I know of a more pressing problem with Statscan data, and so do they: conventional surveys are vastly understating the incomes of the poorest Canadians, and as a consequence the extent of inequality in Canada. This one did not make the front page.

Two years ago, Statscan quietly released a study by Marc Frenette, David Green and Garnett Picot. The study compared standard survey data used to generate estimates of inequality with tax and census data that have more comprehensive coverage, especially at the top and bottom of the distribution. They conclude:

The tax data we use, points to much larger increases in market income inequality driven both by rises at the top of the distribution and very substantial falls at the bottom. According to the tax data, and in contrast to the survey data, significant increases in after-tax and transfer income inequality were witnessed in the 1990s. Moreover, the level of inequality is much higher in the tax data than the survey data in each year, due mainly to much lower earnings at the bottom of the distribution.

In essence, survey and tax data vary in their estimates of the extent to which inequality rose during the recovery of the 1990s. This variation was largely due to the fact that after-tax income at the bottom of the distribution fell substantially according to tax data, while in survey data it simply failed to increase at the same pace as at the top of the distribution.

Aside from differences in trends, the level of income received by families at the bottom of the income distribution is considerably higher in survey data than it is in tax data. We put forth two hypotheses that could explain the difference. First, survey respondents may be more likely to report certain (small) income amounts than tax filers. These small income amounts might be more prominent at the bottom of the distribution. However, the move from pure survey data (in SCF) towards partial tax data (in SLID) does not result in any substantial decline in income at the bottom of the distribution, suggesting this is not likely the primary explanation.

A second hypothesis relates to the possible under-coverage of low-income individuals in survey data. If the gap between survey and tax data is indeed caused by a difference in the way income is reported, then we would expect the gap to still be present even if somehow the coverage in survey data was as high as in tax data. Alternatively, if the difference in coverage is at the heart of this gap, then we would expect the gap to disappear if the coverage in survey data were to match that of tax data.

To shed some light on this issue, we compare the pre-tax income distribution from survey and tax data with that of Census data. The Census collects income data in much the same way as the SCF did, yet the coverage rate is much higher (as in the tax data).

We find that the bottom end of the income distribution in Census data more closely resembles the tax data than the survey data, both in terms of levels and trends. Income at the bottom end of the distribution is always higher in survey data than in Census and tax data. Furthermore, Census and tax data both point to a decline in income at the bottom of the distribution between 1995 and 2000. The bottom of the distribution in survey data shows a small improvement in income between 1995 and 2000. These findings are consistent with the notion that survey data may under-represent the bottom end of the income distribution, but further analysis would be required to reach a more definitive conclusion in this regard.

Based on census and tax data, there appears to have been higher levels of inequality and much stronger increases in market income inequality in the 1990s than has been previously acknowledged.
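To make the coverage hypothesis in the passage above concrete, here is a toy simulation. It is only a sketch under invented assumptions, not the authors' method: the lognormal income distribution, the 40 per cent response rate for the bottom fifth, and the function names (`gini`, `bottom_decile_mean`, `simulate`) are all made up for illustration. It builds a synthetic "full coverage" population, then a "survey" that under-samples low-income people, and compares the two.

```python
import random


def gini(incomes):
    """Gini coefficient using the sorted-rank formula."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    # Weight each income by its rank (1 = poorest).
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def bottom_decile_mean(incomes):
    """Average income of the poorest 10 per cent."""
    xs = sorted(incomes)
    k = max(1, len(xs) // 10)
    return sum(xs[:k]) / k


def simulate(seed=0, n=20000, low_response=0.4):
    """Return (population, survey) under assumed, purely illustrative parameters."""
    rng = random.Random(seed)
    # Assumption: incomes are lognormal; the parameters are arbitrary.
    population = [rng.lognormvariate(10.0, 0.9) for _ in range(n)]
    # Income threshold marking the bottom fifth of the population.
    cutoff = sorted(population)[n // 5]
    # Assumption: people below the cutoff show up in the survey only
    # `low_response` of the time; everyone else is always covered.
    survey = [x for x in population if x >= cutoff or rng.random() < low_response]
    return population, survey


if __name__ == "__main__":
    population, survey = simulate()
    print("Gini, full coverage:", round(gini(population), 3))
    print("Gini, survey:       ", round(gini(survey), 3))
    print("Bottom decile mean, full coverage:", round(bottom_decile_mean(population)))
    print("Bottom decile mean, survey:       ", round(bottom_decile_mean(survey)))
```

In this setup the survey shows both a higher bottom-decile income and a lower Gini than the full population, even though every respondent reports income accurately. That is the pattern the coverage hypothesis predicts; it says nothing about how large the real-world gap is.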

Full disclosure: Marc Frenette did BC-specific runs from this dataset, which I used for a CCPA publication on BC income inequality.

To follow up on their initial findings, Frenette and Green, this time with Kevin Milligan, dug deeper and published their results earlier this year. I’ll just post the abstract, as their findings re-affirm the previous work:

We present new evidence on levels and trends in after-tax income inequality in Canada between 1980 and 2000. We argue that existing data sources may miss changes in the tails of the income distribution, and that much of the changes in the income distribution have been in the tails. Our data are constructed from Census files, which are augmented with predicted taxes based on information available from administrative tax data. After validating our approach in predicting taxes on the Census files, we document differences in the levels and trends in after-tax inequality between the newly constructed data source and the more commonly used survey data. We find that after-tax inequality levels are substantially higher based on the new data, primarily because income levels are lower at the bottom than in survey data. The new data show larger long-term increases in after-tax income inequality and far more variability over the economic cycle. This raises interesting questions about the role of the tax and transfer system in mitigating both trends and fluctuations in market income inequality.
