# Problem Set 8: The Effect of Cognitive Behavioral Therapy on Crime and Violence

You can find instructions for obtaining and submitting problem sets here.

You can find the GitHub Classroom link to download the template repository on the Ed Board

## Background

Can noncognitive skills and preferences like patience and identity be changed in adults? And can improving these skills lower crime and violence? A recent paper attempted to answer these questions with a randomized field experiment in Liberia. This exercise is based on:

Blattman, Christopher, Julian C. Jamison, and Margaret Sheridan. 2017. “Reducing Crime and Violence: Experimental Evidence from Cognitive Behavioral Therapy in Liberia.”

American Economic Review, 107(4): 1165-1206.

In this experiment, the researchers randomly assigned two different treatments to all respondents. Half of respondents were randomly assigned to eight weeks of cognitive behavioral therapy designed to reduce self-destructive beliefs or behaviors and promote positive ones. The sessions focused on encouraging goal-setting and self-control with simple behaviors and then built up to dealing with more realistic situations while learning to control emotions. The was to promote noncognitive skills that might reduce tendencies toward criminal activity or violence. Note that not everyone assigned to treatment actually attended all of the meetings. The second treatment was a randomly assigned cash prize of $200. The outcomes of interest are index measures of anti-social behavior like theft, carrying weapons, or selling drugs. An index combines all of these different variables into one continuous measure. The researchers also have baseline measures of some of these variables.

A total of 999 respondents were randomized in the study. The data file for this study is `data/cbt.csv`

and contains the following variables:

Name | Description |
---|---|

`partid` |
participant ID number |

`cashassigned` |
participant assigned to cash treatment (1) or not (0) |

`tpassigned` |
participant assigned to CBT treatment (1) or not (0) |

`attend_80` |
did the participant attend 80% CBT meetings (1) or not (0) |

`steals_basline` |
has the participant committed theft in the last 2 weeks before treatment |

`homeless_basline` |
is the participant sleeping on the streets before treatment |

`year_born` |
year of birth of participant |

`fam_asb_st` |
index score of anti-social behaviors 2 weeks posttreatment |

`carryweapon` |
does participant carry a weapon, 2 weeks posttreatment |

`fam_asb_lt` |
index score of anti-social behaviors 12 months posttreatment |

## Question 1 (9 points)

Experiments are designed so that treatment assignment is unrelated to any background characteristics of the participants. But treatment and control groups will differ a bit by random chance even if they are the same on average. Let’s see if the randomization created groups that are fairly balanced by doing a *balance test*.

Load the `tidyverse`

and `infer`

packages and load the data into R and name it `cbt`

. Create two new variables in the `cbt`

tibble:

`cbt_assigned`

that is`"CBT"`

if`tpassigned`

is 1 and`"No CBT"`

otherwise.`unhoused_baseline`

that is`"Unhoused"`

if`homeless_baseline`

is 1 and`"Housed"`

otherwise.

Calculate the difference in proportions in the baseline measure of sleeping on the streets (`unhoused_baseline`

) between the CBT treatment group and the non-CBT group (`cbt_assigned`

). Save this value as `base_diff`

. Next, use the `infer`

framework to conduct a two-sided permutation hypothesis test of whether the CBT treatment group has the same proportion of `unhoused_baseline`

as the non-CBT group. Calculate the two-sided p-value using `get_p_value()`

and save it as `base_p`

.

Report the observed difference and the p-value in the write-up. With an \(\alpha\) of 0.05, can you reject the null hypothesis that the proportion sleeping on the streets is the same in the two groups? Given that decision, which type of error (type I vs type II) would be worried for this test?

**NOTE:** For this problem, set the seed to 02138 at the top of the chunk and use `reps = 1000`

when generating the permutations.

**Rubric**: 1pt for successfully compiling Rmd (autograder); 1pt for loading the data and creating the new variables (autograder); 2pts for the `base_diff`

(autograder); 3pts for the `base_p`

(autograder); 1pt for reporting the difference/p-value and the rejection or not decision (PDF); 1pts for correct error type (PDF).

## Question 2 (8 Points)

A colleague suggests that you should use actual attendance at the CBT meetings rather than the random assignment since there shouldn’t be any treatment effect for those that don’t attend the meetings. To investigate this, you decide to do a balance test on the attendance variable that measures if a person actually attended 80% of the CBT meetings (`attend_80`

). Create a new variable:

`cbt_attended`

that is`"Attended CBT"`

if`attend_80`

is 1 and`"Not Attended"`

otherwise.

Calculate the difference in proportions in baseline sleeping on the streets (`unhoused_baseline`

) for those that actually attended 80% of meetings versus those that did not (`cbt_attended`

) and save this difference as `base_diff_attend`

. Calculate the two-sided p-value for another permutation hypothesis for the null hypothesis that the true proportion of baseline sleeping on the street is the same across values of `attend_80`

. Save this p-value as `base_p_attend`

.

Report the observed difference and the p-value in the write-up. Is any imbalance that you find on this variable statistically significant (that is, can you reject the null hypothesis) if \(\alpha\) is 0.05? Which type of error are we worried about here? What parameter of the hypothesis test can control this type of error?

**NOTE:** For this problem, set the seed to 02138 at the top of the chunk and use `reps = 1000`

when generating the permutations.

**Rubric:** 2pts for the `base_diff_attend`

(autograder); 3pts for the `base_p_attend`

(autograder); 1pt for reporting the difference/p-value and the rejection or not decision (PDF); 1pt for correct error type (PDF); 1pt for correctly identifying what parameter controls this error (PDF).

## Question 3 (3 points)

Explain how the tests in questions 1 and 2 should inform whether it is better to use the CBT assignment or the actually attendence to the CBT meetings as the treatment variable when estimating causal effects.

**Rubric:** 2pts for correctly identifying which variable should be used (PDF); 1pt for justification (PDF).

## Question 4 (7 points)

Calculate the estimated ATE of the CBT treatment assignment on the short-term index measure of anti-social behavior (`fam_asb_st`

) and save the estimate as `ate`

.

Generate 1000 simulations from the null distribution of a permutation hypothesis test of the null hypothesis that there is no treatment effect. Save this distribution as `ate_null_dist`

. Plot this distribution of the difference in means under the null hypothesis along with the observed estimate. Explain, in substantive terms of this setting, what this distribution represents.

Calculate the two-sided p-value and save it as `ate_p`

. Is the estimated effect statistically significant when alpha is 0.05?

**NOTE:** For this problem, set the seed to 02138 at the top of the chunk and use `reps = 1000`

when generating the permutations.

**Rubric**: 2pts for `ate`

(autograder); 2pts for the `ate_null_dist`

(autograder); 1pt for `ate_p`

(autograder); 1pt for plot of the null distribution (PDF); 1pt for explanation of null distribution and reporting if null is rejected (PDF).

## Question 5 (8 points)

Calculate the estimated ATE of the CBT treatment assignment on the long-term index measure of anti-social behavior (`fam_asb_st`

) and save the estimate as `ate_lt`

.

Generate 1000 simulations from the null distribution of a permutation hypothesis test of the null hypothesis that there is no treatment effect. Save this distribution as `ate_lt_null_dist`

. Plot this distribution of the difference in means under the null hypothesis along with the observed estimate and the shaded region for the p-value using `visualize()`

and `shade_p_value()`

.

Calculate the two-sided p-value and save it as `ate_lt_p`

. Is the estimated effect statistically significant when alpha is 0.05?

In the write up, describe what this and the ATE estimated in question 4 mean for the persistence of the effect of CBT over time?

**NOTE:** For this problem, set the seed to 02138 at the top of the chunk and use `reps = 1000`

when generating the permutations.

**Rubric**: 2pts for `ate_lt`

(autograder); 2pts for the `ate_lt_null_dist`

(autograder); 1pt for `ate_lt_p`

(autograder); 1pt for plot of the null distribution (PDF); 2pts for reporting if null is rejected and describing the persistence over time (PDF).