library(infer)
set.seed(02138)
Problem Set 7: Exposure to Inequality and Support for Redistribution
You can find instructions for obtaining and submitting problem sets here.
You can find the GitHub Classroom link to download the template repository on the Ed Board.
Background
Does exposure to inequality affect our support for redistributive policies such as taxes on higher income earners? A recent paper explored the effect of brief exposure to socioeconomic inequality in an everyday setting on support for a millionaire’s tax. This exercise is based on:
Sands, Melissa L. 2017. “Exposure to inequality affects support for redistribution.” Proceedings of the National Academy of Sciences, 114(4): 663-668.
In this experiment, the author hired actors to stand in affluent, predominantly white, commercial areas around Boston, MA that have high pedestrian traffic. These actors were either white or Black, and each actor dressed in attire that indicated either affluence (well-dressed, well-groomed) or poverty (unkempt, shabby clothing). The author randomly assigned shifts to each actor with randomly chosen attire to stand on a city street within 20 feet of a petitioner hired by the researcher. This petitioner would stop every third adult that walked past the actor and ask them to sign a petition for the millionaire’s tax (a measure in MA to impose an additional tax of 4% on individuals with annual incomes of $1 million or more) or to sign a petition about reducing the use of plastic bags in local stores. The type of petition was randomly assigned as well. The outcome of interest is whether the respondent agreed to sign the petition on the millionaire’s tax (the plastic bag petition is used as a placebo).
A total of 2,591 respondents were petitioned, with 1,335 being petitioned about the millionaire’s tax. Petitioners also collected their “best guess” about the gender, age, and race/ethnicity of each person approached. The data file for this study is `inequality-exposure.csv` and contains the following variables:
Name | Description |
---|---|
`signed` | 1 if the respondent signed the petition, 0 otherwise |
`mill_tax` | 1 if petitioned about the millionaire’s tax, 0 for plastic bag petition |
`blackactor` | 1 if actor was Black for this respondent, 0 for white |
`pooractor` | 1 if actor was in poverty condition, 0 for affluent condition |
`black` | 1 if petitioner guessed respondent was Black |
`white` | 1 if petitioner guessed respondent was non-Hispanic white |
`asian` | 1 if petitioner guessed respondent was Asian |
`hisp` | 1 if petitioner guessed respondent was Hispanic |
`young` | 1 if petitioner guessed respondent was 18-35 years old |
`middle` | 1 if petitioner guessed respondent was 36-65 years old |
`old` | 1 if petitioner guessed respondent was >65 years old |
`female` | 1 if petitioner guessed respondent was female |
`clust` | Cluster number of respondent (see question 6) |
Question 1 (5 points)
Load the data into R and name it `ineq`. Create a tibble called `mill_df` that is filtered to respondents petitioned about the millionaire’s tax. We will use this data throughout the exercise. Create two new variables:
- `costume` that is `"Poor"` when `pooractor` is 1 and `"Affluent"` otherwise
- `race_actor` that is `"Black"` when `blackactor` is 1 and `"White"` otherwise
Calculate the following object, saving it with the name indicated:
- `ineq_diff`: The difference in means in petition signing (`signed`) between seeing the actor in the poor and affluent conditions (`costume`) for those who were petitioned about the millionaire’s tax. This should be a 1x1 tibble.

Report this value in the text of your write-up and briefly interpret it.
Rubric: 1pt for the Rmd file rendering (autograder); 1pt for correct `mill_df` tibble (autograder); 2pts for correct `ineq_diff` tibble (autograder); 1pt for write-up and interpretation (PDF).
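The filter, recode, and difference-in-means steps described above can be sketched on a tiny made-up data set. This is only an illustration under stated assumptions: the eight rows below are invented, and the names `toy`, `toy_mill`, `toy_diff`, and `diff` are hypothetical stand-ins, not the objects the autograder expects.

```r
library(dplyr)
library(tibble)

# Invented toy rows; these are NOT values from inequality-exposure.csv.
toy <- tibble(
  signed    = c(1, 0, 0, 1, 1, 0, 1, 0),
  mill_tax  = c(1, 1, 1, 1, 1, 1, 0, 0),
  pooractor = c(1, 1, 1, 0, 0, 0, 1, 0)
)

# Keep only millionaire's tax petitions and recode the treatment as a label.
toy_mill <- toy |>
  filter(mill_tax == 1) |>
  mutate(costume = if_else(pooractor == 1, "Poor", "Affluent"))

# Difference in mean signing rates: Poor minus Affluent, as a 1x1 tibble.
toy_diff <- toy_mill |>
  summarize(
    diff = mean(signed[costume == "Poor"]) - mean(signed[costume == "Affluent"])
  )
```

In this toy example the poor-condition signing rate is 1/3 and the affluent-condition rate is 2/3, so the difference is negative: respondents in the made-up data signed less often when the actor appeared poor.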
Question 2 (10 points)
In the first line of the code chunk for this question use the following code:
set.seed(02138)
Generate 1,000 bootstrap replications of the estimated ATE from Question 1 and save these bootstraps in a tibble called `ate_boots`. You may use the `rep_slice_sample()` or `specify/generate` approach, but the column of bootstrapped ATEs should be called either `ATE` or `stat`.
Use these bootstraps to calculate a 95% confidence interval for the difference in means using the percentile method and save this as `ate_ci_95`, which should be a 1 by 2 tibble.
Use `ggplot()` and `geom_histogram()` to plot the bootstrap distribution and overlay it with the confidence interval using this geom:
geom_vline(xintercept = unlist(ate_ci_95))
This will be manually graded in the PDF so be sure it shows up in the PDF. Use informative labels.
In the write-up, discuss whether the CI contains zero. What does that mean?
Rubric: 4pts for correct `ate_boots` tibble (autograder); 2pts for correct `ate_ci_95` (autograder); 3pts for plot of the bootstrap distribution and CI (PDF); 1pt for write-up (PDF).
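As a rough sketch of the `rep_slice_sample()` route to a percentile interval, the example below resamples a simulated data set with replacement and takes the 2.5th and 97.5th percentiles of the bootstrapped differences. The data are simulated (an assumption made so the block runs on its own), and `toy_mill`, `toy_boots`, `toy_ci_95`, `lower`, and `upper` are hypothetical names.

```r
library(dplyr)
library(infer)

set.seed(02138)

# Simulated stand-in data (made up): 200 respondents with a random
# treatment label and a binary outcome unrelated to treatment.
toy_mill <- tibble::tibble(
  costume = sample(c("Poor", "Affluent"), 200, replace = TRUE),
  signed  = rbinom(200, size = 1, prob = 0.5)
)

# Each replicate is a same-size resample drawn with replacement;
# the difference in means is recomputed within each replicate.
toy_boots <- toy_mill |>
  rep_slice_sample(prop = 1, replace = TRUE, reps = 1000) |>
  group_by(replicate) |>
  summarize(
    ATE = mean(signed[costume == "Poor"]) - mean(signed[costume == "Affluent"])
  )

# Percentile method: the middle 95% of the bootstrap distribution.
toy_ci_95 <- toy_boots |>
  summarize(
    lower = quantile(ATE, 0.025, names = FALSE),
    upper = quantile(ATE, 0.975, names = FALSE)
  )
```

Setting `prop = 1` together with `replace = TRUE` is what makes each replicate a bootstrap sample rather than a subsample.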
Question 3 (4 points)
Explain how to interpret 95% confidence intervals in terms of repeated sampling. Is it possible to produce a 100% confidence interval in this setting? If so, what is it and is it useful?
Rubric: 3pts for interpretation of CIs (PDF); 1pt for discussion of 100% confidence interval (PDF).
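The repeated-sampling idea can be explored with a small simulation. The sketch below assumes a known population proportion (0.3, chosen only for illustration), draws many independent samples, builds a normal-approximation 95% interval for the mean in each, and records how often the interval covers the truth. All names and settings here are hypothetical, and the normal-approximation interval is swapped in for simplicity; it is not the bootstrap percentile interval used elsewhere in this problem set.

```r
set.seed(02138)

true_mean <- 0.3   # assumed population proportion, for illustration only
n_samples <- 2000  # number of repeated samples
n_obs     <- 500   # observations per sample

# For each repeated sample, check whether its 95% interval covers the truth.
covers <- replicate(n_samples, {
  x  <- rbinom(n_obs, size = 1, prob = true_mean)
  se <- sd(x) / sqrt(n_obs)
  ci <- mean(x) + c(-1, 1) * qnorm(0.975) * se
  ci[1] <= true_mean && true_mean <= ci[2]
})

# Share of intervals that contain the true value.
coverage <- mean(covers)
```

The quantity `coverage` describes the procedure across samples, not any single interval.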
Question 4 (5 points)
Calculate the ATE for Black and White actors separately (using the `race_actor` variable) and calculate the interaction or difference between these two ATEs. The output should be a 1 row tibble named `ate_race` with columns `ATE_Black`, `ATE_White`, and `ATE_Diff` that are the ATE for Black actors, the ATE for White actors, and the difference between them, respectively.
In the write-up, report the interaction and describe what it means in the substantive context of this experiment.
Rubric: 3pts for correct `ate_race` tibble (autograder); 2pts for reporting the effect and interpretation (PDF).
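To make the interaction concrete, here is a minimal sketch on invented data: one difference in means per actor race, then the difference between the two. The eight rows are made up, and `toy`, `toy_race`, and the helper `diff_by()` are hypothetical names introduced only for this illustration.

```r
library(tibble)

# Invented toy rows; these are NOT values from the study.
toy <- tibble(
  signed     = c(1, 0, 1, 1, 1, 0, 1, 0),
  costume    = c("Poor", "Poor", "Affluent", "Affluent",
                 "Poor", "Poor", "Affluent", "Affluent"),
  race_actor = c("Black", "Black", "Black", "Black",
                 "White", "White", "White", "White")
)

# Hypothetical helper: Poor-minus-Affluent signing rate for one actor race.
diff_by <- function(df, race) {
  d <- df[df$race_actor == race, ]
  mean(d$signed[d$costume == "Poor"]) - mean(d$signed[d$costume == "Affluent"])
}

# One-row tibble; ATE_Diff is built from the two columns defined before it.
toy_race <- tibble(
  ATE_Black = diff_by(toy, "Black"),
  ATE_White = diff_by(toy, "White"),
  ATE_Diff  = ATE_Black - ATE_White
)
```

In the toy rows the effect is -0.5 for Black actors and 0 for White actors, so the interaction is -0.5: the poverty costume lowers signing by half a point more when the actor is Black in this made-up example.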
Question 5 (11 points)
In the first line of the code chunk for this question use the following code:
set.seed(02138)
Use `rep_slice_sample` (not `specify/generate`) to generate 1,000 bootstrap replications of the difference in ATEs between Black and White actors from Question 4. Save this tibble as `ate_race_boots`.
Then construct a 95% confidence interval for the difference between the ATE for Black actors and the ATE for White actors and save this confidence interval as `ate_race_ci_95`.
Use `ggplot()` and `geom_histogram()` to plot the bootstrap distribution and overlay it with the confidence interval using this geom:
geom_vline(xintercept = unlist(ate_race_ci_95))
This will be manually graded in the PDF so be sure it shows up in the PDF. Use informative labels.
In the write-up, discuss whether the CI contains zero. What does that mean?
Rubric: 5pts for correct `ate_race_boots` tibble (autograder); 2pts for correct `ate_race_ci_95` (autograder); 3pts for plot of the bootstrap distribution and CI (PDF); 1pt for write-up (PDF).
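One possible shape for bootstrapping a difference between two ATEs with `rep_slice_sample()` is sketched below on simulated data. Everything here is an assumption made for illustration: the data are simulated with no true effect, and `toy_mill`, `ate_for()`, `toy_race_boots`, `toy_race_ci_95`, and `boot_plot` are hypothetical names. The key point is that both ATEs are recomputed inside every replicate before they are differenced.

```r
library(dplyr)
library(infer)
library(ggplot2)

set.seed(02138)

# Simulated stand-in data (made up): random treatment, actor race, and outcome.
n <- 400
toy_mill <- tibble::tibble(
  costume    = sample(c("Poor", "Affluent"), n, replace = TRUE),
  race_actor = sample(c("Black", "White"), n, replace = TRUE),
  signed     = rbinom(n, size = 1, prob = 0.5)
)

# Hypothetical helper: Poor-minus-Affluent signing rate among rows
# where the actor has the given race.
ate_for <- function(signed, costume, race_actor, race) {
  keep <- race_actor == race
  mean(signed[keep & costume == "Poor"]) -
    mean(signed[keep & costume == "Affluent"])
}

# Resample rows with replacement, then difference the two ATEs per replicate.
toy_race_boots <- toy_mill |>
  rep_slice_sample(prop = 1, replace = TRUE, reps = 1000) |>
  group_by(replicate) |>
  summarize(
    ATE_Diff = ate_for(signed, costume, race_actor, "Black") -
      ate_for(signed, costume, race_actor, "White")
  )

# Percentile 95% interval as a 1 by 2 tibble.
toy_race_ci_95 <- toy_race_boots |>
  summarize(
    lower = quantile(ATE_Diff, 0.025, names = FALSE),
    upper = quantile(ATE_Diff, 0.975, names = FALSE)
  )

# Histogram of the bootstrap distribution with the interval overlaid.
boot_plot <- ggplot(toy_race_boots, aes(x = ATE_Diff)) +
  geom_histogram(bins = 30) +
  geom_vline(xintercept = unlist(toy_race_ci_95)) +
  labs(
    x = "Bootstrapped difference in ATEs (Black actor minus White actor)",
    y = "Number of bootstrap replications"
  )
```

Printing `boot_plot` in a code chunk renders the figure; the axis labels shown are one example of informative labels, not a required wording.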