Conduct a sensitivity analysis for estimated ecological quantities

Relates the strength of an omitted variable's confounding with the predictor and the outcome to the resulting bias in ecological estimates, using the nonparametric sensitivity analysis of Chernozhukov et al. (2024).

Usage

ei_sens(
  est,
  c_outcome = seq(0, 1, 0.01)^2,
  c_predictor = seq(0, 1, 0.01)^2,
  bias_bound = NULL,
  confounding = 1,
  expand_ci = TRUE
)

Arguments

est: A set of estimates from ei_est() produced using both a regression model and a Riesz representer.
c_outcome: The (nonparametric) partial $R^2$ of the omitted variables with the outcome variables. Must be between 0 and 1. Can be a vector, in which case all combinations of values with c_predictor are used.
c_predictor: How much variation latent variables create in the Riesz representer, i.e. $1-R^2$ of the true Riesz representer on the estimated one without the omitted variable. Must be between 0 and 1. Can be a vector, in which case all combinations of values with c_outcome are used.
bias_bound: If provided, overrides c_predictor and finds values of c_predictor that correspond to (the absolute value of) the provided amount of bias.
confounding: The confounding parameter ($\rho$), which must be between 0 and 1. The default of 1 is the adversarial worst case.
expand_ci: If TRUE and confidence intervals are present in est, expand the width of the intervals in each direction by the calculated bias bound.

Value

A data frame of the same format as est, but with additional columns: c_outcome and c_predictor, containing all combinations of those arguments, and bias_bound, containing the bound on the amount of bias. The data frame has the additional class ei_sens, which supports a plot.ei_sens() method.
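The shape of the returned data frame can be illustrated with a minimal base R sketch. It does not call the package: est_toy is a made-up stand-in for the output of ei_est(), the bias_bound values are placeholders rather than real bounds, and the conf.low column name is an assumption (only conf.high appears in the examples).

```r
# Toy stand-in for the output of ei_est(); all values are made up.
est_toy = data.frame(
  predictor = c("a", "b"),
  estimate  = c(0.40, 0.50),
  conf.low  = c(0.30, 0.35),  # column name assumed for illustration
  conf.high = c(0.50, 0.65)
)

# All combinations of the two sensitivity parameters.
grid = expand.grid(c_outcome = c(0.2, 0.5), c_predictor = c(0, 0.1, 0.2))

# Cartesian product: one row per estimate per parameter combination.
out = merge(est_toy, grid, by = NULL)
nrow(out)  # 2 estimates x 6 combinations

# Placeholder bias bound (NOT the real formula), used only to show how
# expand_ci = TRUE widens each interval by the bound in both directions.
out$bias_bound = 0.05 * out$c_predictor
out$conf.low   = out$conf.low - out$bias_bound
out$conf.high  = out$conf.high + out$bias_bound
```

The same counting explains the first example in the Examples section: three predictors, one value of c_outcome, and the 101 default values of c_predictor give 303 rows.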

Details

The parameter c_predictor equals $1 - R^2_{\alpha\sim\alpha_s}$, where $\alpha$ is the true Riesz representer and $\alpha_s$ is the Riesz representer based on the observed covariates only. The true Riesz representer can be equivalently expressed as $$ \alpha = \partial_{\bar x_j} \log f(\bar x_j\mid z, u), $$ where $u$ is the unobserved confounder and $f$ is the conditional density. The corresponding c_predictor is then $$ 1 - R^2_{\alpha\sim\alpha_s} = 1 - \frac{\mathbb{E}[(\partial_{\bar x_j} \log f(\bar x_j\mid z))^2]}{\mathbb{E}[(\partial_{\bar x_j} \log f(\bar x_j\mid z, u))^2]}. $$
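To make the roles of the sensitivity parameters concrete, here is a minimal sketch of the bias bound as stated in Chernozhukov et al. (2024): $|\text{bias}| \le \rho\, S \sqrt{C_Y^2\, C_D^2}$, with $C_Y^2$ taken to be c_outcome and $C_D^2 = (1 - R^2_{\alpha\sim\alpha_s}) / R^2_{\alpha\sim\alpha_s}$, i.e. c_predictor / (1 - c_predictor). The functions bias_bound_sketch() and c_predictor_for_bias() are hypothetical helpers, not part of the package; the scale factor s is an assumed input, and the package's internal computation may differ.

```r
# Hypothetical helper (not a package function): plug-in bias bound
#   |bias| <= rho * S * sqrt(C_Y^2 * C_D^2)
# with C_Y^2 = c_outcome and C_D^2 = c_predictor / (1 - c_predictor).
bias_bound_sketch = function(s, c_outcome, c_predictor, confounding = 1) {
  stopifnot(s >= 0,
            c_outcome >= 0, c_outcome <= 1,
            c_predictor >= 0, c_predictor < 1,
            confounding >= 0, confounding <= 1)
  confounding * s * sqrt(c_outcome * c_predictor / (1 - c_predictor))
}

# Hypothetical inverse: the c_predictor that produces a target bias,
# mirroring what the bias_bound argument is described as doing.
c_predictor_for_bias = function(s, c_outcome, bias, confounding = 1) {
  t = (abs(bias) / (confounding * s))^2 / c_outcome  # implied C_D^2
  t / (1 + t)
}

s = 0.25  # assumed scale factor, for illustration only
b = bias_bound_sketch(s, c_outcome = 0.2, c_predictor = 0.04)
c_back = c_predictor_for_bias(s, c_outcome = 0.2, bias = b)  # recovers 0.04
```

Under this form the bound grows without limit as c_predictor approaches 1, and for small c_predictor it scales roughly with the square root of c_predictor, which is the pattern visible in the bias_bound column of the first example.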

The bounds here are plug-in estimates and do not incorporate sampling uncertainty. As such, they may fail to cover the true value in finite samples, even under large enough sensitivity parameters; see Section 5 of Chernozhukov et al. (2024).

References

Chernozhukov, V., Cinelli, C., Newey, W., Sharma, A., & Syrgkanis, V. (2024). Long story short: Omitted variable bias in causal machine learning (No. w30302). National Bureau of Economic Research.

Examples

data(elec_1968)

spec = ei_spec(elec_1968, vap_white:vap_other, pres_ind_wal,
               total = pres_total, covariates = c(state, pop_urban, farm))
m = ei_ridge(spec)
rr = ei_riesz(spec, penalty = m$penalty)
est = ei_est(m, rr, spec)

ei_sens(est, c_outcome=0.2)
#> # A tibble: 303 × 7
#>    predictor outcome      estimate std.error c_outcome c_predictor bias_bound
#>    <chr>     <chr>           <dbl>     <dbl>     <dbl>       <dbl>      <dbl>
#>  1 vap_white pres_ind_wal    0.387    0.0246       0.2      0         0      
#>  2 vap_black pres_ind_wal    0.490    0.0505       0.2      0         0      
#>  3 vap_other pres_ind_wal   -1.19     0.528        0.2      0         0      
#>  4 vap_white pres_ind_wal    0.387    0.0246       0.2      0.0001    0.00111
#>  5 vap_black pres_ind_wal    0.490    0.0505       0.2      0.0001    0.00392
#>  6 vap_other pres_ind_wal   -1.19     0.528        0.2      0.0001    0.0909 
#>  7 vap_white pres_ind_wal    0.387    0.0246       0.2      0.0004    0.00221
#>  8 vap_black pres_ind_wal    0.490    0.0505       0.2      0.0004    0.00784
#>  9 vap_other pres_ind_wal   -1.19     0.528        0.2      0.0004    0.182  
#> 10 vap_white pres_ind_wal    0.387    0.0246       0.2      0.0009    0.00332
#> # ℹ 293 more rows

# How much of the variation in the Riesz representer would the
# regression residual need to explain to cause a bias of 0.4?
ei_sens(est, c_outcome=1, bias_bound=0.4)
#> # A tibble: 3 × 7
#>   predictor outcome      estimate std.error c_outcome c_predictor bias_bound
#>   <chr>     <chr>           <dbl>     <dbl>     <dbl>       <dbl>      <dbl>
#> 1 vap_white pres_ind_wal    0.387    0.0246         1    0.723           0.4
#> 2 vap_black pres_ind_wal    0.490    0.0505         1    0.172           0.4
#> 3 vap_other pres_ind_wal   -1.19     0.528          1    0.000388        0.4

# Update confidence intervals and extract as matrix
est = ei_est(m, rr, spec, conf_level=0.95)
sens = ei_sens(est, c_outcome=0.5, c_predictor=0.2)
as.matrix(sens, "conf.high")
#>            outcome
#> predictor   pres_ind_wal
#>   vap_white    0.5228791
#>   vap_black    0.8984785
#>   vap_other    7.0309701