::modelsummary(
modelsummary
fit,statistic = c("s.e. = {std.error}",
"p = {p.value}"),
gof_map = c("nobs", "r.squared", "adj.r.squared")
)
Practice Exam 2: Windpower and NIMBYism
You can find instructions for obtaining and submitting problem sets and exams here.
You can find the GitHub Classroom link to download the template repository on the Ed Board
Background
This practice exam is based on the following article:
Stokes, Leah C. (2016). “Electoral Backlash against Climate Policy: A Natural Experiment on Retrospective Voting and Local Resistance to Public Policy Authors” American Journal of Political Science, Vol. 60, No. 4, pp. 958-974.
The paper explores the period from 2003 to 2011 in Ontario, Canada. During that time the Liberal Party government passed the Green Energy Act. This policy allowed groups (corporations, communities, and even individuals) to build wind turbines and other renewable energy projects throughout the province. Further, the government agreed to sign long-term contracts to buy the energy produced by these projects.
Although opinion polls suggest that there was broad support for green energy projects, many voters appeared angry about having a windfarm near them. That is, many people wanted windfarms, but just not near them. This is sometimes called NIMBYism (Not In My BackYard). Stokes’s paper investigated whether people near windfarms were more likely to vote against the Liberal Party, which put in place a policy that promoted windfarms.
Here’s a partial codebook for the variables in wind.csv
:
Name | Description |
---|---|
master_id |
Precinct ID number |
year |
Election year |
prop |
Binary variable indicating whether there was a proposed turbine in that precinct in that year |
op |
Binary variable indicating if there is an operational turbine in that precinct in that year |
perc_lib |
Votes cast for Liberal Party divided by the number of voters who cast ballots in precinct |
log_pop |
Log of the population of the precinct |
log_pop_denc |
Log of the population density (population per square mile) of the precinct |
avg_home_val_log |
Log of the average home value in the precinct |
Because windfarms were only created in rural parts of Ontario, we are going to restrict the analysis to rural areas; see paper for definition. Further, we are only going to analyze a random sample of 500 rural precincts for computational reasons.
Finally, the author assumes that the location of the windfarms was as-if random. In other words, just like people in a clinical trial are randomly assigned to receive treatment or control, in this case it was as if the windfarms were assigned to precincts without regard to the political attitudes of residents. Under this assumption that windfarm location was unrelated to political preferences, we can give this regression a causal interpretation.
Instructions
This exam is open-book, open-note, and open-internet. However, you are forbidden from communicating with other humans about the exam. This includes, but is not limited to: exchanging texts/emails/chats/DMs etc about the exam; sharing notes about the exam or the course; posting material on the internet about the exam; asking for help with a question on the exam from online forums; requesting that someone produce materials that could be helpful for the exam. Basically, use your common sense and complete this assignment on your own.
You may submit to the autograder as many times as you would like, just like a normal problem set.
Please write any written, non-code responses in the main text and not in the R code chunks. Also, do not include hastags
#
in the main text except on the lines indicating## Question
or## Answer
.Please check your final PDF before uploading it and ensure that your written answers and plots are visible and correctly reflect your final answers.
If you have a clarification question or you think there is an issue with the autograder, please email myself and your TF with your question. We will either say that we cannot answer the question or we will send a note to the entire class with the answer.
Please do not post anything about the exam on Slack or Ed.
Question 1
Use read_csv
to load the data and assign it to the name wind
. In the write-up, answer the following questions. What years are included? How many precincts are included? How many year-precincts are included?
Question 2
Make a boxplot that shows the distribution of vote share for the Liberal Party in each year. What do you conclude about how the Liberal Party is doing over time from this plot?
Question 3
Now we are going to explore whether districts that had proposed wind turbines decreased their support for the Liberal Party. First, create a new factor variable called proposed
that takes the values:
"Proposed"
ifprop == 1
and"Not Proposed"
ifprop == 0
Calculate the difference in mean percentage vote for the Liberal Party between precincts where wind turbines were and were not proposed. Save this value as ate
.
In the write-up, report the estimated difference in means and provide a substantive interpretation of the coefficient.
Question 4
NOTE: For this problem, set the seed to 02138 at the top of the chunk and use reps = 1000
when generating the permutations.
Use the infer
framework to conduct a two-sided permutation hypothesis test of whether the proposed wind turbine precincts have the same mean of perc_lib
as precincts without wind farm proposals. Calculate the two-sided p-value using get_p_value()
and save it as ate_p
.
In the write-up, report the p-value associated with this test and interpret it in this substantive context. Based on the output, can you reject this null hypothesis if \(\alpha = 0.05\)?
Question 5
Suppose now we wanted to make sure that our estimated ATE was not due to confounding factors. Run a regression of percentage support for the Liberal Party on prop
and the following control variables: a set of dummy variables for the year (hint: you may have to use the factor()
function to convert year
from numeric to factor), the log of population, the log of population density, and the log of the average home value in the precinct. Save this multiple regression as fit
.
In the write-up, report the estimate coefficient on prop
and interpret it in this substantive context. Does controlling for these factors change the estimates in meaningful way?
Question 6
Use the tidy()
function from the broom
package to obtain a regression table for the fit
output and save it as fit_table
. In the write up, use the following code to display the estimated coefficients, standard errors, and p-values:
Answer the following questions in the write up:
- What is the standard error on the estimated coefficient for proposed wind farms? Describe what variability this standard error measures.
- A p-value for this coefficient is provided in the output. Describe what the null hypothesis is for the associated test in this substantive context.
- Report the p-value for
prop
in the text. Can we reject the null with an \(\alpha = 0.05\)? What about \(\alpha = 0.1\)?
Extra credit: use the coef_map
argument to modelsummary()
to produce a regression table with nicely formatted variable names.