Algorithmic approach to forecasting rare violent events

Bibliographic Details
Authors: Berk, Richard (Author) ; Sorenson, Susan B. (Author)
Format: Electronic Article
Language: English
Published: 2020
In: Criminology & public policy
Year: 2020, Volume: 19, Issue: 1, Pages: 213-233
Online Access: Full text (resolving system)
Description
Summary:
Research Summary: Mass violence, almost no matter how defined, is (thankfully) rare. Rare events are difficult to study in a systematic manner. Standard statistical procedures can fail badly, and usefully accurate forecasts of rare events often are little more than an aspiration. We offer an unconventional approach for the statistical analysis of rare events illustrated by an extensive case study. We report research aimed at learning about the attributes of very-high-risk intimate partner violence (IPV) perpetrators and the circumstances associated with their IPV incidents reported to the police. "Very high risk" is defined as having a high probability of committing a repeat IPV assault in which the victim is injured. Such individuals represent a very small fraction of all IPV perpetrators; these acts of violence reported to the police are rare. To learn about them nevertheless, we sequentially apply in a novel fashion three algorithms to data collected from a large metropolitan police department: stochastic gradient boosting, a genetic algorithm inspired by natural selection, and agglomerative clustering. We try to characterize not just perpetrators who on balance are predicted to reoffend but also who are very likely to reoffend in a manner that leads to victim injuries. Important lessons for forecasts of mass violence are presented.
Policy Implications: If one intends to forecast mass violence, it is probably important to consider approaches less dependent on statistical procedures common in criminology. Given that one needs to "fatten" the right tail of the rare events distribution, a combination of supervised machine learning and genetic algorithms may be a useful approach. One can then study a synthetic population of rare events almost as if they were an empirical population of rare events. Variants on this strategy are increasingly common in machine learning and causal inference. Our overall goal is to unearth predictors that forecast well. In the absence of sufficiently accurate forecasts, scarce resources to help prevent mass violence cannot be allocated where they are most needed.
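Illustrative sketch (not part of the catalog record or the article): the summary describes using a genetic algorithm to "fatten" the right tail of a rare-events distribution. The Python below is a minimal sketch of that general idea only, not the authors' implementation. The function `predicted_risk`, the three features, and all weights are hypothetical placeholders standing in for the output of a fitted stochastic gradient boosting model; the population size, mutation rate, and number of generations are likewise assumptions.

```python
import random

def predicted_risk(case):
    """Hypothetical stand-in for a fitted model's predicted probability.

    In a real analysis this would come from a trained classifier such as
    stochastic gradient boosting. Features and weights here are invented.
    """
    prior_incidents, weapon_present, age = case
    score = 0.08 * prior_incidents + 0.3 * weapon_present - 0.005 * (age - 18)
    return max(0.0, min(1.0, score))

def random_case(rng):
    # (prior incidents 0-10, weapon present 0/1, age 18-70): assumed ranges
    return (rng.randint(0, 10), rng.randint(0, 1), rng.randint(18, 70))

def crossover(a, b, rng):
    # Uniform crossover: each feature is copied from one parent at random
    return tuple(rng.choice(pair) for pair in zip(a, b))

def mutate(case, rng, rate=0.2):
    # Small random perturbations, kept inside the assumed feature ranges
    prior, weapon, age = case
    if rng.random() < rate:
        prior = min(10, max(0, prior + rng.choice([-1, 1])))
    if rng.random() < rate:
        weapon = 1 - weapon
    if rng.random() < rate:
        age = min(70, max(18, age + rng.choice([-2, 2])))
    return (prior, weapon, age)

def evolve(pop_size=200, generations=30, seed=0):
    """Evolve synthetic cases toward higher predicted risk."""
    rng = random.Random(seed)
    population = [random_case(rng) for _ in range(pop_size)]
    initial_mean = sum(predicted_risk(c) for c in population) / pop_size
    for _ in range(generations):
        # Selection: the fitter half survives and produces the next children
        population.sort(key=predicted_risk, reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    final_mean = sum(predicted_risk(c) for c in population) / pop_size
    return population, initial_mean, final_mean

if __name__ == "__main__":
    population, initial_mean, final_mean = evolve()
    print(f"mean predicted risk: {initial_mean:.2f} -> {final_mean:.2f}")
```

The evolved population is synthetic: it concentrates on feature combinations the model scores as high risk. In the workflow the summary outlines, such a population would then be examined further, for example with agglomerative clustering, which this sketch does not implement.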
ISSN: 1745-9133
DOI: 10.1111/1745-9133.12476