Cross, Philip: Regressions, Short and Long
World Conference Econometric Society, 2000, Seattle

Philip Cross, Georgetown University
Charles F. Manski, Northwestern University
Regressions, Short and Long
Session: C-8-17  Monday 14 August 2000  by Cross, Philip
We study the problem of identification of the long regression E(y|x,z) when the short conditional distributions P(y|x) and P(z|x) are known but the long conditional distribution P(y|x,z) is not known. This problem often arises when a researcher uses data from two separate data sets. (A leading example is the ecological inference problem of political science, where voting behavior across electoral districts is observed from administrative records, the demographic composition of voters within a district is observed from census data, and the researcher wants to infer voting behavior conditional on district and demographic attributes.) We isolate an identification region containing the feasible values of the long regression, and show that this region forms a sharp bound on the long regression. The identification region can be calculated precisely when y has finite support. When y has infinite support, we characterize two sets: one that contains the identification region, and one that is contained in it. Following this completely nonparametric analysis, we examine the identifying power yielded by exclusion restrictions across distinct covariate values. Such restrictions cause the identification region to shrink, in many cases to a single point. To illustrate the theory, we pose and address this hypothetical question: What would be the outcome if the 1996 U.S. presidential election were re-enacted in a population of different demographic composition, ceteris paribus?
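As a rough illustration of the simplest finite-support case, the sketch below assumes both y and z are binary. It is not the paper's general procedure, and the function names and numbers are hypothetical. With q = P(y=1|x) and p = P(z=1|x), the law of total probability gives q = a·p + b·(1−p), where a = E(y|x,z=1) and b = E(y|x,z=0) must both lie in [0,1]; this pins a down only to an interval. The second function assumes an exclusion restriction under which a and b do not vary across two covariate values with different p, in which case the two linear equations have a unique solution.

```python
def long_regression_bounds(q, p):
    """Interval of feasible values for a = E(y | x, z=1), binary y and z.

    q: P(y=1 | x), p: P(z=1 | x). Illustrative helper, not from the paper.
    Uses q = a*p + b*(1-p) with a and b both restricted to [0, 1].
    """
    if not (0.0 <= q <= 1.0 and 0.0 < p <= 1.0):
        raise ValueError("need 0 <= q <= 1 and 0 < p <= 1")
    lower = max(0.0, (q - (1.0 - p)) / p)  # reached when b = 1
    upper = min(1.0, q / p)                # reached when b = 0
    return lower, upper


def solve_with_exclusion(q1, p1, q2, p2):
    """Solve for (a, b) assuming they are the same at two covariate values.

    Solves q1 = a*p1 + b*(1-p1) and q2 = a*p2 + b*(1-p2) by Cramer's rule.
    A solution outside [0, 1]^2 would mean the assumed restriction is
    inconsistent with the observed short distributions.
    """
    det = p1 - p2  # determinant of the 2x2 system
    if det == 0:
        raise ValueError("p1 and p2 must differ for a unique solution")
    a = (q1 * (1.0 - p2) - q2 * (1.0 - p1)) / det
    b = (p1 * q2 - p2 * q1) / det
    return a, b


# Hypothetical inputs: two districts with different demographic shares.
low, high = long_regression_bounds(q=0.62, p=0.8)
a, b = solve_with_exclusion(q1=0.62, p1=0.8, q2=0.46, p2=0.4)
print(f"bounds on a without restrictions: [{low:.3f}, {high:.3f}]")
print(f"point solution under exclusion restriction: a={a:.3f}, b={b:.3f}")
```

In this made-up example the unrestricted interval for a is wide, while the assumed exclusion restriction reduces it to a single point that lies inside that interval.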

