
Census geography: Pooling adjacent tracts to improve reliability of estimates

Beginning in December 2008, three-year estimates for geographic units with populations of 20,000 or more have been released annually. At the end of 2010, the ACS began to release five-year estimates for all geographic units. Pooling five years of data was initially expected to yield tract estimates whose reliability is comparable to the sample count estimates from 2000. It now seems likely that tract estimates will not achieve this level of reliability, and researchers who wish to use tract data will be interested in examining ways to pool data over multiple tracts. Beginning with the 2006-2010 release, the five-year estimates have been controlled to population counts from Census 2010, which removed one source of error in ACS data.

  • Andrew Beveridge (Queens College, CUNY) has expressed concerns with the methods used by the Census Bureau to estimate the Margin of Error (MOE). His memorandum and the response prepared by the Census Bureau clarify some of the issues that are involved. These are posted here:
  • David Wong (George Mason University) has developed some tools under contract with the Census Bureau to take account of the MOE problem in spatial analysis and mapping. These tools and further explanation are available here:

One approach to the MOE problem is to pool together data from adjacent tracts. For example, in one Census 2010 report (“Separate and Unequal”), John Logan defined the “neighborhood” that a person lived in as their census tract plus each of the surrounding tracts. These neighborhoods, of course, are overlapping. We have now developed data files that identify the adjacent tracts for every census tract in the nation in 1990, 2000, 2005-2009, 2010, and 2020. (Although the 2005-2009 ACS used 2000 tract IDs in most cases, there are cases where the 2010 ID was used in error.) Researchers can use these files to aggregate census data for larger neighborhood areas. The adjacency files can also be used to create “spatial lag” variables, that is, variables that describe the context in which a tract is embedded. Such variables are increasingly used in studies of neighborhood effects.
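To illustrate why pooling helps, here is a minimal sketch in Python. It assumes the estimates are counts, that the published MOEs are at the 90 percent level (z = 1.645), and that the MOE of a sum can be approximated by the root sum of squares of the component MOEs, which ignores any covariance between tracts. The tract values are made up for illustration and the function names are hypothetical.

```python
import math

def pool_estimates(estimates, moes):
    """Sum count estimates across tracts and approximate the MOE of the sum.

    Uses the root-sum-of-squares approximation, which ignores covariance
    between the tract estimates.
    """
    total = sum(estimates)
    pooled_moe = math.sqrt(sum(m * m for m in moes))
    return total, pooled_moe

def cv(estimate, moe, z=1.645):
    """Coefficient of variation: standard error divided by the estimate.

    Assumes the MOE is at the 90 percent level (z = 1.645).
    """
    return (moe / z) / estimate

# Made-up counts and MOEs for a focal tract and two adjacent tracts.
estimates = [120, 80, 100]
moes = [60, 45, 50]

total, pooled_moe = pool_estimates(estimates, moes)
print(f"focal tract CV: {cv(estimates[0], moes[0]):.2f}")
print(f"pooled CV:      {cv(total, pooled_moe):.2f}")
```

In this made-up case the pooled MOE grows more slowly than the pooled estimate, so the relative error of the neighborhood figure is smaller than that of the focal tract alone.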

Download the files necessary to identify, for every census tract, its adjacent tracts. In GIS terms, this is equivalent to a queen contiguity weights matrix.

For each year we provide two files, which together give users the ability to determine all of the tracts that are adjacent to a given tract. This capability would be especially useful for spatial analyses that require a weights matrix, such as spatial regression.
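As a rough sketch of how a list of focal/neighbor pairs can stand in for a weights matrix, the Python below builds row-standardized weights and a spatial lag. The pairs and values are made up, the function names are hypothetical, and dropping self-pairs and row-standardizing are choices made for this example, not something the files prescribe.

```python
from collections import defaultdict

def row_standardized_weights(pairs):
    """Build sparse row-standardized weights from (focal, neighbor) pairs.

    Self-pairs are dropped here so that a tract does not count as its own
    neighbor; that is a choice for this sketch.
    """
    neighbors = defaultdict(set)
    for focal, neighbor in pairs:
        if focal != neighbor:
            neighbors[focal].add(neighbor)
    return {
        focal: {n: 1.0 / len(ns) for n in ns}
        for focal, ns in neighbors.items()
    }

def spatial_lag(weights, values):
    """Weighted average of neighbors' values for each focal tract."""
    return {
        focal: sum(w * values[n] for n, w in row.items())
        for focal, row in weights.items()
    }

# Made-up example: three tracts in a row, each pair list includes the tract itself.
pairs = [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 2), (3, 3)]
values = {1: 10.0, 2: 20.0, 3: 60.0}

weights = row_standardized_weights(pairs)
lag = spatial_lag(weights, values)
print(lag[2])  # average of tracts 1 and 3
```

Storing the weights as a dictionary of dictionaries keeps only the nonzero entries, which matters because a national tract-by-tract matrix would be almost entirely zeros.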

The file named tract_nnnn (where nnnn is the year) lists the key geographic identifiers for tracts. These include the state, county, and tract numbers; a tractid that concatenates these into a single number; and a tract ID (GISJOIN) that stores this number as a string variable. The file also gives every tract a simple sequential ID number that can be used to identify it and its adjacent tracts. In the file named nlist_nnnn, every focal tract is listed in one column with one identifier (FID), and each of its neighboring tracts (and the focal tract itself) is listed as a separate case with another identifier (NID).
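The two files can be combined along the following lines. This is only a sketch in Python: it assumes the comma-delimited format, that FID and NID hold the sequential ID from the tract file, and that the sequential ID column is named ID (a hypothetical name; check the header of the file you download). The inline sample rows, GISJOIN values, and population counts are made up.

```python
import csv
import io
from collections import defaultdict

# Made-up stand-ins for tract_nnnn and nlist_nnnn in comma-delimited form.
# The column name "ID" for the sequential id is an assumption.
tract_csv = """ID,GISJOIN
1,G0100010020100
2,G0100010020200
3,G0100010020300
"""
nlist_csv = """FID,NID
1,1
1,2
2,1
2,2
2,3
3,2
3,3
"""

# Map each sequential id to its GISJOIN string.
id_to_gisjoin = {
    row["ID"]: row["GISJOIN"]
    for row in csv.DictReader(io.StringIO(tract_csv))
}

# Collect, for every focal tract, the tracts in its neighborhood.
# Because the list includes the focal tract itself, so does the neighborhood.
neighborhood = defaultdict(list)
for row in csv.DictReader(io.StringIO(nlist_csv)):
    focal = id_to_gisjoin[row["FID"]]
    neighborhood[focal].append(id_to_gisjoin[row["NID"]])

# Made-up tract populations keyed by GISJOIN.
population = {
    "G0100010020100": 1000,
    "G0100010020200": 2000,
    "G0100010020300": 3000,
}

# Aggregate population over each (overlapping) neighborhood.
neighborhood_pop = {
    focal: sum(population[t] for t in members)
    for focal, members in neighborhood.items()
}
print(neighborhood_pop)
```

With real downloads, the two `io.StringIO` objects would be replaced by opened files, and the population dictionary by whatever census variable is being pooled.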

Select the year(s): 1990, 2005-09 ACS
Select the format: SPSS/PC+ Version 19, Comma Delimited, SAS V9+ Windows, STATA Version 8 SE