The census and inequality

A few years ago an important study by Marc Frenette, David Green and Kevin Milligan, Revisiting Recent Trends in Canadian After-Tax Income Inequality Using Census Data, was published by Statscan. It did not get much attention, but its implications for the current census debacle are startling. The authors summarize:

… [E]xisting data sources may miss changes in the tails of the income distribution, and that much of the changes in the income distribution have been in the tails. Our data are constructed from Census files, which are augmented with predicted taxes based on information available from administrative tax data. … We find that after-tax inequality levels are substantially higher based on the new data, primarily because income levels are lower at the bottom than in survey data. The new data show larger long-term increases in after-tax income inequality and far more variability over the economic cycle. This raises interesting questions about the role of the tax and transfer system in mitigating both trends and fluctuations in market income inequality.

For example, in 2000, the Survey of Labour and Income Dynamics (SLID) reported average income in the bottom decile of just under $8,000 (adult equivalent dollars), whereas the Census found $6,000, about 25% less than the SLID figure. Average income in the top decile is also higher in the Census than in the SLID, meaning “the ratio of the average income in the top decile to the average in the bottom decile was 16.4 in 2000 according to the Census but only 11.7 according to SCF/SLID.”
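As a quick check on that arithmetic, here is a minimal Python sketch. It treats the SLID bottom-decile average as exactly $8,000 (the text says "just under"), so the results are approximate. The implied top-decile averages are back-of-envelope figures derived from the quoted ratios, not numbers reported in the paper.

```python
# Rounded figures from the text; "just under $8,000" is treated as 8,000 (assumption).
slid_bottom = 8_000    # SLID, average bottom-decile income, 2000
census_bottom = 6_000  # Census, average bottom-decile income, 2000

# Shortfall of the Census figure relative to the SLID figure, in percent.
gap_pct = (slid_bottom - census_bottom) / slid_bottom * 100
print(f"Census bottom decile is {gap_pct:.0f}% below SLID")

# Top-to-bottom decile ratios quoted from the paper.
census_ratio = 16.4
slid_ratio = 11.7

# Implied top-decile averages (ratio x bottom-decile average); illustrative only.
census_top = census_ratio * census_bottom
slid_top = slid_ratio * slid_bottom
print(f"Implied top decile: Census ~${census_top:,.0f}, SCF/SLID ~${slid_top:,.0f}")
```

On these rounded inputs the implied Census top-decile average comes out above the SCF/SLID one, which is consistent with the statement that top-decile income is also higher in the Census.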

Why is the census an essential resource? It is worth quoting Frenette, Green and Milligan (FGM) at length:

Census data, which is the source upon which we focus in this paper, has several appealing characteristics. First, it has no breaks over our period of interest. In contrast, the SCF was replaced by SLID in 1996. Although this did not affect average levels of income, it did affect incomes at the top and bottom of the distribution, requiring some kind of adjustment at the time of the “seam”. …

A second appealing feature of the Census is its coverage. Response to the Census is mandatory by law, and as such, coverage of the population is almost complete, with the exception of very specific groups (most notably, on-reserve Aboriginals, individuals in collective dwellings, and the homeless). Response to the SCF/SLID is voluntary, and roughly 20% of selected households choose not to do so. This creates the potential for response bias that may be related to income. The SCF/SLID datasets include weights calculated so that key sample characteristics mimic those of the population as a whole, but income is not one of the characteristics. Thus, to the extent that response bias is related to income, even after controlling for observables that are directly addressed by the weights, the weighted income distribution obtained from the SCF/SLID may still not correspond to that for the whole population. The population coverage on T1FF is quite good, but only after 1993 when the combination of incentives from child tax credits and goods and services tax (GST) rebates improved the filing incentives for very low income individuals.

A third feature of the Census (like T1FF) is its very large sample size (20% of the population), allowing researchers to conduct more detailed analyses of income inequality. In particular, large samples are important for obtaining reliable measures of movements in extreme percentiles of the distribution. In contrast to the Census, SCF/SLID has approximately 30,000 to 35,000 observations, making both detailed decompositions and examinations of extreme tails of the income distribution more problematic.

A fourth advantage of the Census is that it contains detailed socio-economic information on its respondents. This is also true in the SCF/SLID, but not in tax data. In particular, education is missing from the tax files.
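The second point in the passage above, nonresponse that is related to income, can be illustrated with a small simulation. This is only a sketch under invented assumptions: the lognormal income distribution, its parameters, and the response rates by income group are all hypothetical, chosen so that overall nonresponse is roughly 20%. It leaves out survey weights entirely and shows only the raw effect of lower response in the tails.

```python
import random
import statistics

def decile_ratio(incomes):
    """Mean income of the top tenth divided by mean income of the bottom tenth."""
    ordered = sorted(incomes)
    k = len(ordered) // 10
    return statistics.mean(ordered[-k:]) / statistics.mean(ordered[:k])

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: lognormal incomes (parameters are arbitrary).
population = [rng.lognormvariate(10.3, 0.8) for _ in range(100_000)]

# Income cut-offs for the bottom and top tenths of the population.
ordered = sorted(population)
k = len(ordered) // 10
low_cut, high_cut = ordered[k], ordered[-k]

def response_prob(income):
    """Assumed response rates: lower in both tails, about 80% overall."""
    if income < low_cut:
        return 0.50
    if income >= high_cut:
        return 0.70
    return 0.85

# A voluntary "survey": each person responds with the assumed probability.
respondents = [y for y in population if rng.random() < response_prob(y)]

population_ratio = decile_ratio(population)
survey_ratio = decile_ratio(respondents)
response_rate = len(respondents) / len(population)

print(f"response rate:          {response_rate:.2f}")
print(f"decile ratio, everyone: {population_ratio:.1f}")
print(f"decile ratio, survey:   {survey_ratio:.1f}")
```

Because people in the tails respond less often, the "bottom decile" of respondents reaches further up the true distribution and the "top decile" reaches further down, so the measured ratio understates the true one. That is the direction of the Census versus SCF/SLID gap described above, though the size of the effect here depends entirely on the made-up response rates.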

4 comments

  • Armine Yalnizyan

    Marc you beat me to the punch!
    Was going to file a post today on this very subject!
    May well do later, when things settle.
    Thanks for reminding us of this.

  • There is a lot of over- and under-coverage in the tax data for specific groups, such as seniors and low-income earners. Also, there are issues with the economic and census family concepts that need a whole lot more study when using the tax data.

    SCF/SLID was designed for a whole different purpose than providing cross-sectional snapshots of income. It also rides on the back of the LFS survey program, and therefore does not come even close to the power of the census long form. The sample size of 35,000 or so is minuscule compared to the census, and can barely deliver coefficients of variation (CVs) of 30% at the CMA level.

    As I indicated the other day on this blog, and as Ivan Fellegi also stated over the weekend, the census is integrated within the core of the LFS and many other surveys to allocate and benchmark the LFS sampling plan. For example, there is no other source of occupational data like the census. The point being, without the census to help design and maintain the frame of the LFS, we open the window to a whole lot less statistical reliability.

    As stated above, I would say you will most likely see the LFS and other surveys now having CVs above 30% at some important geographic levels, such as some CMAs.

    So any notion of using such a vehicle to measure income will also be lost for any survey that makes use of the LFS, for example the GSS.

    Sadly, I could write quite a long book on why this has many negative ramifications for the data that, as the outcry makes obvious, many people use in their work and public life.

    We are seemingly up against a Tea Party-like defence of why we should scrap the mandatory census long form. So we can argue until we are blue in the face, and I am not sure any of these points will even be acknowledged by the other side. I am just hoping the Tim Horton’s crowd hates tea.

  • Thanks for the nice write-up of our paper. A version of this ended up in the CJE in 2007. I should note that our work here was an application and extension of earlier, more fundamental comparative work on the different data by Frenette, Green and Garnett Picot. http://www.statcan.gc.ca/pub/11f0019m/11f0019m2004219-eng.pdf

    Regards–and thanks to people around here (Armine I’m looking at you) for their big efforts in building coalitions on this issue.

  • Concerned Economist

    There is a parliamentary committee on the Census next week. The Liberals or NDP need this information. It might force them to call in witnesses.

    Anybody have any connections?
