Friday, July 4, 2008

NIBRS and the UCR

Long-time readers (all 3 of you) will know of my interest in the media violence and behavioral outcome literature. Part of this interest is due to having read DellaVigna and Dahl's working paper that matched movie release dates (by violence category) to crime outcomes using the National Incident-Based Reporting System. Interestingly, they find a negative association between a violent movie and crime, quite the opposite of what one would expect from the media violence literature in psychology. It looks like they have updated the paper, so note that there is now a June 7th, 2008 version online. My mind quickly forgets things from one moment to the next, let alone papers I read months ago, but from glancing at it, I don't remember them using box office revenue data before. Now it appears they are, which is obviously the way to go. You want to focus on extremely popular violent movies, since those represent the largest treatment. Small, insignificant, and unknown violent movies should have little to no effect on violent crime outcomes, and if it turned out the results were being driven by the unknown films rather than the popular ones, then frankly, I think we'd have good reason to suspect the results merely reflect some unknown, unobservable background variable common to the release of a movie and crimes.

One of the things I like about this paper, though, is its innovative use of the NIBRS dataset. I'm working with NIBRS now, and will be again in the fall, so I've had some chance to learn a little about it. It is unlike the FBI's long-running crime surveillance dataset, the Uniform Crime Reports, though it is actually part of that larger surveillance program. What makes NIBRS interesting is, first of all, that it covers "crime incidents" and not just arrests. It's possible that a person is arrested for something other than the offenses listed, or that there are multiple offenses listed for one single incident (a person robs a bar, then goes into the bathroom to look for more victims, finds a woman there, and rapes her: two offenses, but one incident). NIBRS lets you study things like this. So it is much larger in scope than the traditional UCR data, because it includes more information on each incident: detailed information on the different offenses in the incident, the offender(s), the victim(s), the property damage, the arrestee(s), and other things along those lines.
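
Just to make the incident/offense distinction concrete, here's a minimal sketch in Python. The column names (incident_id, offense_code) are my own placeholders, not the actual NIBRS field names, and the toy rows mirror the bar example above:

```python
import pandas as pd

# Hypothetical offense-level rows mirroring the bar example above.
# Column names are placeholders, not actual NIBRS field names.
offenses = pd.DataFrame({
    "incident_id": [1001, 1002, 1002],
    "offense_code": ["120", "120", "11A"],  # e.g. a robbery; a robbery plus a rape
})

# One incident can carry several offenses, so these two counts come apart:
n_offenses = len(offenses)                       # 3 offense rows
n_incidents = offenses["incident_id"].nunique()  # 2 distinct incidents
print(n_offenses, n_incidents)                   # -> 3 2
```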

You can also get the calendar date of the incident (any of the 365 days of the year, in other words) plus the hour of the day in military time. This is what makes the DellaVigna and Dahl paper pretty interesting. They isolate crimes by the actual day and hour associated with a violent movie's release, which means they get to use variation within a year, or even within a month, in movie releases and crime weekends. As I work more on a project that is similar to theirs in design and purpose, I'll probably have more thoughts on the strengths and weaknesses of this approach, but I do wonder if it can effectively control for the seasonality of crime and the seasonality of violent films. For instance, I know that summer blockbusters are seasonal and typically popcorn fluff: lots of action, and violence. Yet their release isn't random; it is correlated with certain months. Those are also the months in which kids are let out of school, and with kids no longer incapacitated by school 8 hours a day, one wonders if there isn't an increase in crime associated with that season. So effectively, DellaVigna and Dahl need to exploit variation within that seasonality to identify the effect, and I think they do this for the most part, because their design lets them focus on days and hours within seasons. I'll need to think more about it, but this seems to me to get around that complaint.
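
To illustrate the kind of day-and-hour matching their design implies, here is a rough sketch. All the column names and the toy data (incident_date, hour, release_date, violent) are hypothetical stand-ins, not the actual NIBRS or movie-data fields:

```python
import pandas as pd

# Hypothetical data; column names are placeholders for the real fields.
incidents = pd.DataFrame({
    "incident_date": pd.to_datetime(["2004-02-27", "2004-02-28", "2004-03-03"]),
    "hour": [22, 14, 21],  # military time
})
releases = pd.DataFrame({
    "release_date": pd.to_datetime(["2004-02-27"]),  # a Friday opening
    "violent": [1],
})

# Restrict to evening/night hours, when moviegoing plausibly happens.
evening = incidents[incidents["hour"].between(18, 23)].copy()

# Collect the Friday-Sunday opening weekend for each violent release.
weekend_days = {
    d + pd.Timedelta(days=k)
    for d in releases.loc[releases["violent"] == 1, "release_date"]
    for k in range(3)
}

# Flag evening incidents falling on a violent movie's opening weekend.
evening["opening_weekend"] = evening["incident_date"].isin(weekend_days)
print(evening)
```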

What's interesting, too, is that they find that one of the few movies associated with a positive increase in violent crime is The Passion of the Christ. Now, this is an interesting result, because if you remember at the time, there were worries about anti-Semitism surrounding this movie: both because it was believed to be an anti-Semitic movie, and because it was believed to potentially provoke an anti-Semitic backlash among WASPs, or at least among hate groups. I read of one such incident at the time, but it's been a while, and I have never seen anything else on it. Well, in this working paper, the authors find that it is one of the only movies in their sample to be positively associated with an increase in social violence/crime. What would be interesting, as a way of determining whether this is spurious, is to exploit the "bias motivation" variable in the "offense segment" of the data. This variable records whether the reporting police officer perceived any bias related to the offense. Values can be all sorts of "anti-racial," "anti-ethnic," "anti-sexual orientation," "anti-religious," and so on; there are about 15-20 different kinds of biases recorded. It'd be interesting to see, in other words, whether this is an increase in biased crimes. They do not appear to exploit that variable in their current draft, though. I would also worry about possible endogeneity with regard to that particular film. There was a lot of talk of possible violent hate crimes happening when the movie came out. Is it possible police increased their efforts on the weekend of the movie's opening, and in so doing, increased the number of offenses reported? If police are increasing law enforcement effort in response to the film's release, then the positive association they find wouldn't represent the movie causing an increase in violence so much as it would simply reflect an increase in enforcement effort. There should be some way to test this, even if it just involved calling police jurisdictions to ask whether that weekend was any different.
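
If I were going to poke at the bias angle, a first pass might look something like the sketch below. The field name and the bias labels here are illustrative placeholders, not the exact NIBRS codings, and the toy rows are made up; the release date is the real one, February 25, 2004:

```python
import pandas as pd

# Hypothetical offense-segment rows; field names and labels are placeholders.
offenses = pd.DataFrame({
    "incident_date": pd.to_datetime(
        ["2004-02-26", "2004-02-27", "2004-02-28", "2004-03-06"]),
    "bias_motivation": ["none", "anti-religious", "anti-ethnic", "none"],
})

# Offenses flagged with any perceived bias by the reporting officer.
bias_labels = {"anti-racial", "anti-ethnic", "anti-religious"}
biased = offenses[offenses["bias_motivation"].isin(bias_labels)]

# The Passion of the Christ opened Wednesday, February 25, 2004.
release = pd.Timestamp("2004-02-25")
in_window = biased["incident_date"].between(
    release, release + pd.Timedelta(days=4))
print(in_window.sum(), "bias-flagged offenses in the opening window")
```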

Anyhow, NIBRS is an interesting dataset. Like all datasets, though, it is not perfect. For one, even by 2005, the most recent year for which we have data, only 32 states were participants. The Passion of the Christ was released in 2004, when only 30 states participated. Some of the more obvious missing states are California, Florida, Illinois, and New York: fairly important areas for violence and movie venues, and frankly important missing areas for any study of crime. In 2004, the states participating in NIBRS were: AZ, AR, CO, CT, DE, DC, GA, ID, IA, KS, KY, LA, ME, MA, MI, NE, NH, ND, OH, OR, RI, SC, SD, TN, TX, UT, VT, VA, WV, and WI. I'm also not entirely sure how many jurisdictions within a state report this data. Is it every single city and county jurisdiction in a state? The NIBRS manual and codebook don't come out and say exactly what is and is not in here, so I'm not sure.

But nonetheless, used in combination with other crime data sources, NIBRS has many appealing features, and once you read the codebook closely, it's not impossible to work with, though I do recommend that you read the entire codebook first. You'll be glad you did, because the data comes to you in 13 different "segments," and you really need to merge some of them together to get the full power of the data. While merging is technically easy with most statistical packages, it's nonetheless conceptually a little confusing with NIBRS, so you have to spend some time with it, just making sure you know what's what.
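
For what it's worth, here's a minimal sketch of what merging a few segments might look like in Python with pandas. The segment names, columns, and the incident_id key are my own placeholders, so check the codebook for the actual segment layout and linking variables:

```python
import pandas as pd

# Hypothetical segment tables; names, columns, and keys are placeholders.
admin = pd.DataFrame({
    "incident_id": [1, 2],
    "incident_date": pd.to_datetime(["2004-02-27", "2004-02-28"]),
})  # one row per incident
offense = pd.DataFrame({
    "incident_id": [1, 2, 2],
    "offense_code": ["120", "120", "11A"],
})  # one row per offense
victim = pd.DataFrame({
    "incident_id": [1, 2],
    "victim_age": [34, 27],
})  # one row per victim

# Offense rows are many-to-one against incidents, so the first merge
# fans the data out to the offense level; the victim merge does the same.
df = (admin
      .merge(offense, on="incident_id", how="left")
      .merge(victim, on="incident_id", how="left"))
print(df)
```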
