Urban Transition Project

Mapping without ED Descriptions

For a number of cities, no records of ED descriptions survive. In some other cases, the existing ED descriptions referred to political boundaries that could not be ascertained. Additional effort was therefore required to infer the ED boundaries for these cities: Allegheny, Cincinnati, Cleveland, Columbus, Denver, Hartford, Jersey City, Milwaukee, Mobile, New Haven, Newark, Oakland, Pittsburgh, Richmond, and San Francisco.

Creating enumeration districts for cities with missing descriptions is an inductive process, similar to a logic puzzle, in which multiple sources of imperfect information are brought together to suggest a solution.

1. Civil Divisions. The Census Office instructed District Supervisors to create EDs within wards and other minor political units. Therefore, as a first step, we assumed that ward boundaries should form one layer of our ED map.

2. Intersecting Streets. From the transcription of street addresses created by the Minnesota Population Center and the microdata file from NAPP, we could determine for any given street which EDs its residents should be found in. If two streets intersected, there would in principle be only one ED that they had in common, and on that basis we could create a map of intersections and their corresponding ED numbers. We illustrate the result for Cleveland. The colored polygons represent our final determination of the area of each ED. The colored dots are intersections, and their colors represent the ED in which they "should" have been located. Note that the initial locations include many errors. Area A shown on the map is a zone where street density and transcription accuracy are high enough that EDs are readily apparent. Area B is a zone where intersections turn out to be a poor guide to the ED boundaries, and more information is needed to find their limits.

Comparison of the initial mapping of intersections with the final boundaries for Cleveland
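The intersection rule in step 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the street names, ED numbers, and the function name intersection_eds are hypothetical and are not taken from the project's data or software.

```python
def intersection_eds(street_eds, intersections):
    """Return, for each pair of intersecting streets, the EDs they share.

    street_eds: dict mapping a street name to the set of EDs in which
                its residents were enumerated (hypothetical input format).
    intersections: list of (street_a, street_b) pairs known to cross.
    """
    shared = {}
    for a, b in intersections:
        shared[(a, b)] = street_eds.get(a, set()) & street_eds.get(b, set())
    return shared


# Made-up example data, for illustration only.
street_eds = {
    "Euclid Ave": {12, 13, 14},
    "Erie St": {13, 21},
    "Superior St": {12, 21, 22},
}
intersections = [
    ("Euclid Ave", "Erie St"),
    ("Erie St", "Superior St"),
    ("Euclid Ave", "Superior St"),
]

candidates = intersection_eds(street_eds, intersections)

# An intersection with exactly one shared ED gets that ED as its label;
# zero or several shared EDs mark it as needing further evidence.
labels = {pair: next(iter(eds)) for pair, eds in candidates.items() if len(eds) == 1}
```

Intersections that share no ED, or more than one, correspond to the kind of ambiguity described for Area B, where other sources have to settle the boundary.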

3. Geocoding Addresses. ED descriptions and intersections provided a good approximation of ED boundaries for all cities. A final source of information, however, is the geocoded addresses of residents, and these were used in every city as an additional test of accuracy. Geocoding in the sense used here refers to the automated procedure of determining a spatial point location for a street address using GIS software. This software requires a data file that identifies, for every street segment, the range of addresses found between its two ends and which sides of the street are odd-numbered and even-numbered.
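As a rough illustration of what such a data file makes possible, the sketch below places an address on a street segment by interpolating within the segment's address range and using parity to choose the side of the street. The Segment fields and the geocode function are hypothetical names chosen for this example; actual GIS software uses its own schema and adds offsets, name matching, and other refinements.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    street: str
    start: tuple    # (x, y) of the end with the lowest house number
    end: tuple      # (x, y) of the end with the highest house number
    low: int        # lowest house number on the segment
    high: int       # highest house number on the segment
    odd_side: str   # "left" or "right": side carrying odd numbers


def geocode(street, number, segments):
    """Return ((x, y), side) for an address, or None if no segment matches."""
    for seg in segments:
        if seg.street == street and seg.low <= number <= seg.high:
            span = seg.high - seg.low
            # Fraction of the way along the segment; midpoint if the range is a single number.
            t = 0.5 if span == 0 else (number - seg.low) / span
            x = seg.start[0] + t * (seg.end[0] - seg.start[0])
            y = seg.start[1] + t * (seg.end[1] - seg.start[1])
            even_side = "right" if seg.odd_side == "left" else "left"
            side = seg.odd_side if number % 2 == 1 else even_side
            return (x, y), side
    return None


# Made-up segment: a 100-unit block holding house numbers 100 through 198.
segments = [Segment("Main St", (0.0, 0.0), (100.0, 0.0), 100, 198, "left")]
located = geocode("Main St", 149, segments)
```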

Our edited contemporary TIGER file provided the locations of street segments. City directories are the primary resource concerning historical street names, direction, address ranges, and intersecting streets. Unfortunately the city directories for several cities offered no information on addressing: Allegheny, Columbus, Denver, Detroit, Kansas City, Louisville, Minneapolis, Mobile, Nashville, Oakland, Omaha, and Pittsburgh. In some other cities only the address ranges for major streets were listed: Atlanta, Charleston, Cincinnati, Memphis, Milwaukee, New Orleans, and St. Paul.

In the worst case, if we assume that the initial ED boundaries are correct, we can align all residents along a street by house number and so discover the address range for that street within the ED. Their actual locations along the street can then be estimated by linear interpolation. In some cities it is hard to improve on this approach, because the addressing system gives only a location relative to other addresses on the same street and does not indicate an absolute location. This is most likely in places where blocks are of irregular length and streets tend to follow the contours of the landscape, as is often true on the outskirts of a city.
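A minimal sketch of this fallback, assuming the only input is the list of house numbers observed on one street within one ED: the smallest and largest numbers define the address range, and each resident is placed proportionally between them. The function name relative_positions is hypothetical.

```python
def relative_positions(house_numbers):
    """Map each observed house number to a fraction (0.0 to 1.0) along the street.

    The address range is inferred from the data itself: the lowest number
    is placed at one end of the street's run within the ED and the highest
    at the other, with everything else linearly interpolated.
    """
    numbers = sorted(set(house_numbers))
    if not numbers:
        return {}
    low, high = numbers[0], numbers[-1]
    if low == high:
        # Only one distinct number observed: place it at the midpoint.
        return {low: 0.5}
    return {n: (n - low) / (high - low) for n in numbers}


# Made-up house numbers for one street within one ED.
positions = relative_positions([30, 10, 20, 20])
```

The resulting fractions are relative only: they order residents along the street but say nothing about where a given number falls in absolute terms, which is the limitation noted above.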

The system dominant in the Midwest and West has house numbers assigned through reference to a central point in the city. This Cartesian (or Philadelphia) system typically has a single point of origin for all addresses demarcated by the crossing of two baseline streets, with breaks at a regular distance, most often the distance of one city block – hence the 100 block, 200 block, etc. In such cities geocoding can be done with a high degree of accuracy even where historical city directories do not provide address ranges.
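The arithmetic behind such a grid can be sketched as follows. The sketch assumes, purely for illustration, that every block is allotted 100 house numbers, that numbering starts at 0 at the baseline street, and that all blocks have the same length; real cities differ on each of these points, and the function name and the 400-foot block length are hypothetical.

```python
def distance_from_baseline(house_number, block_length, numbers_per_block=100):
    """Estimate the distance of an address from the baseline street.

    Returns (block_index, distance), where block_index counts whole blocks
    from the baseline (block 2 is the "200 block") and distance is in the
    same units as block_length.
    """
    block_index = house_number // numbers_per_block
    distance = house_number / numbers_per_block * block_length
    return block_index, distance


# Example: number 250 with assumed 400-foot blocks.
block, dist = distance_from_baseline(250, 400)
```

Under these assumptions, number 250 sits halfway along the 200 block, 1,000 feet from the baseline, without any need for a directory listing of address ranges.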

In practice, there are multiple sources of error for all cities, from the ambiguities in ED descriptions to mistakes by enumerators to errors in transcription. Our approach in every city was to make multiple iterations: estimate ED boundaries, map residents using geocoded addresses, then correct obvious boundary errors and redo the geocoding of residents. The ED boundaries mapped as a result of this process have a high degree of accuracy.
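One way to picture the checking step in each iteration is to measure, for every ED, the share of its residents whose geocoded points fall outside the current estimate of that ED's polygon. The sketch below uses a standard ray-casting point-in-polygon test on made-up coordinates; it illustrates the idea only and is not the procedure or software the project used.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point lies inside polygon (a list of (x, y) vertices)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def mismatch_rates(residents, ed_polygons):
    """Share of each ED's residents whose point falls outside that ED's polygon.

    residents: list of (recorded_ed, (x, y)) pairs.
    ed_polygons: dict mapping an ED number to its current polygon estimate.
    """
    counts = {}
    for ed, pt in residents:
        total, outside = counts.get(ed, (0, 0))
        if not point_in_polygon(pt, ed_polygons[ed]):
            outside += 1
        counts[ed] = (total + 1, outside)
    return {ed: outside / total for ed, (total, outside) in counts.items()}


# Made-up data: two adjacent square EDs and four geocoded residents.
ed_polygons = {
    1: [(0, 0), (10, 0), (10, 10), (0, 10)],
    2: [(10, 0), (20, 0), (20, 10), (10, 10)],
}
residents = [(1, (5, 5)), (1, (12, 5)), (2, (15, 5)), (2, (18, 2))]
rates = mismatch_rates(residents, ed_polygons)
```

A high mismatch rate for an ED would flag its boundary for correction before the residents are geocoded again.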