COWLES FOUNDATION FOR RESEARCH IN ECONOMICS
Box 208281
COWLES FOUNDATION DISCUSSION PAPER NO. 1644
Semiparametric Efficiency in GMM Models of Nonclassical Measurement Errors
Xiaohong Chen, Han Hong, and Alessandro Tarozzi
March 2008

We study semiparametric efficiency bounds and efficient estimation of parameters
defined through general nonlinear, possibly non-smooth and over-identified moment
restrictions, where the sampling information consists of a primary sample and an auxiliary
sample. The variables of interest in the moment conditions are not directly observable in
the primary data set, but the primary data set contains proxy variables which are
correlated with the variables of interest. The auxiliary data set contains information
about the conditional distribution of the variables of interest given the proxy variables.
Identification is achieved by the assumption that this conditional distribution is the
same in both the primary and auxiliary data sets. We provide semiparametric efficiency
bounds for both the "verify-out-of-sample" case, where the two samples are
independent, and the "verify-in-sample" case, where the auxiliary sample is a
subset of the primary sample. In each case, the bounds are derived when the propensity score is
unknown, known, or belongs to a correctly specified parametric family. These efficiency
bounds indicate that the propensity score is ancillary in the
"verify-in-sample" case, but is not ancillary in the
"verify-out-of-sample" case. We show that sieve conditional expectation
projection based GMM estimators achieve the semiparametric efficiency bounds for all the
above mentioned cases, and establish their asymptotic efficiency under mild regularity
conditions. Although inverse probability weighting based GMM estimators are also shown to
be semiparametrically efficient, they need stronger regularity conditions and clever
combinations of nonparametric and parametric estimates of the propensity score to achieve
the efficiency bounds for various cases. Our results contribute to the literature on
non-classical measurement error models, missing data and treatment effects.
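
To make the two estimation strategies concrete, the following is a minimal Python sketch for the simplest possible moment condition, m(Y*, beta) = Y* - beta, in a simulated "verify-in-sample" design. It is an illustration under stated assumptions, not the estimators as defined in the paper: the data-generating process, the polynomial sieve, the normalized form of the weighting estimator, the use of a known propensity score, and all function names (`simulate`, `cep_mean`, `ipw_mean`) are hypothetical choices made here.

```python
import numpy as np


def simulate(n, rng):
    """Toy verify-in-sample design (hypothetical, for illustration only)."""
    x = rng.uniform(-1.0, 1.0, size=n)        # proxy, observed for every unit
    # Variable of interest; in real data it is seen only where d is True.
    y_true = 1.0 + 2.0 * x + x**2 + rng.normal(0.0, 0.5, size=n)
    p = 0.5 + 0.3 * x                         # propensity score, in [0.2, 0.8]
    d = rng.uniform(size=n) < p               # True = unit is in the auxiliary sample
    return x, y_true, d, p


def sieve_basis(x, degree):
    """Polynomial sieve basis 1, x, ..., x**degree (an assumed choice of sieve)."""
    return np.vander(x, degree + 1, increasing=True)


def cep_mean(x, y, d, degree=3):
    """Conditional expectation projection estimate of E[Y*].

    Step 1: series least squares of y on the sieve basis, auxiliary sample only.
    Step 2: average the fitted E[Y* | X] over the whole primary sample.
    """
    basis = sieve_basis(x, degree)
    coef, *_ = np.linalg.lstsq(basis[d], y[d], rcond=None)
    return float(np.mean(basis @ coef))


def ipw_mean(y, d, p):
    """Normalized inverse probability weighting estimate of E[Y*].

    Uses only validated units and treats the propensity score p as known.
    """
    w = 1.0 / p[d]
    return float(np.sum(w * y[d]) / np.sum(w))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y_true, d, p = simulate(20_000, rng)
    print("naive mean over validated units:", float(np.mean(y_true[d])))
    print("CEP estimate:", cep_mean(x, y_true, d))
    print("IPW estimate:", ipw_mean(y_true, d, p))
    print("population value:", 4.0 / 3.0)
```

In this design the population mean is 1 + E[X^2] = 4/3, while the naive mean over validated units is biased upward because validation is more likely for large values of the proxy. Both functions read the variable of interest only on validated units, which mirrors the assumption that it is unobserved elsewhere in the primary sample.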