Statistical machine learning methods for spatiotemporal inference and learning

Speaker: Dr Seth Flaxman (Imperial College London)

Venue: Michael Smith Lecture Theatre, Dover St, Manchester M13 9PT

Abstract: Geocoded, time-stamped event datasets are ubiquitous in public health and public policy more broadly. In this talk I will highlight the statistical machine learning methods that I am developing for learning and inference with large-scale crime datasets, based on methods originally developed by spatial statisticians for modelling diseases.

I will start by describing a scalable inference method for the log-Gaussian Cox Process [Flaxman et al., ICML 2015], which allows us to efficiently fit a point pattern dataset of n = 233,088 crime events over a decade in Chicago, discover spatially varying multiscale seasonal trends, and produce highly accurate long-range local area forecasts.
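
For readers new to the model, the textbook definition of a log-Gaussian Cox process can be written as below. The notation is an assumption made here for illustration and is not taken from the talk or the cited paper: s is a space-time location, f is a latent Gaussian process with mean function m and covariance kernel k, and N(A) is the number of events falling in a region A.

```latex
\begin{align*}
f &\sim \mathcal{GP}\big(m(\cdot),\, k(\cdot,\cdot)\big), \\
\lambda(s) &= \exp\big(f(s)\big), \\
N(A) \mid \lambda &\sim \operatorname{Poisson}\!\left(\int_A \lambda(s)\,\mathrm{d}s\right).
\end{align*}
```

The exponential link keeps the intensity positive, and the choice of kernel k is what would encode structure such as seasonality at several scales.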

Building on this work, we used scalable approximate kernel methods to provide a winning solution to the US National Institute of Justice "Real-Time Crime Forecasting Challenge," providing forecasts of four types of crime at a very local level (less than 1 square mile) 1 week, 1 month, and 3 months into the future.

In another line of work, we use a Hawkes process model to quantify the spatial and temporal scales over which shooting events diffuse in Washington, DC, using data collected by an acoustic gunshot locator system, in order to assess the hypothesis that crime is an infectious process. While we find robust evidence for spatiotemporal diffusion, the spatial and temporal scales are extremely short (126 meters and 10 minutes), and thus more likely to be consistent with a discrete gunfight, lasting for a matter of minutes, than with a diffusing, infectious process linking violent events across hours, days, or weeks [Loeffler and Flaxman, Journal of Quantitative Criminology 2017].
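
To make the idea of a self-exciting process concrete, here is a minimal sketch of a spatiotemporal Hawkes conditional intensity. It assumes a constant background rate, an exponential kernel in time, and an isotropic Gaussian kernel in space; these choices, the function name `hawkes_intensity`, and all parameter names are hypothetical and are not claimed to match the model in the cited paper.

```python
import math


def hawkes_intensity(t, x, y, events, mu, alpha, sigma, omega):
    """Conditional intensity at time t (minutes) and location (x, y) (meters).

    events: iterable of (t_i, x_i, y_i) tuples for observed events.
    mu:     constant background rate (assumed homogeneous in this sketch).
    alpha:  expected number of events triggered by each event.
    sigma:  spatial length scale of the Gaussian kernel, in meters.
    omega:  temporal scale of the exponential kernel, in minutes.
    """
    rate = mu
    for t_i, x_i, y_i in events:
        dt = t - t_i
        if dt <= 0:
            continue  # only strictly earlier events can contribute
        d2 = (x - x_i) ** 2 + (y - y_i) ** 2
        # Exponential kernel in time, normalised to integrate to 1 over dt > 0.
        temporal = math.exp(-dt / omega) / omega
        # Isotropic Gaussian kernel in space, normalised over the plane.
        spatial = math.exp(-d2 / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
        rate += alpha * temporal * spatial
    return rate


# Illustrative values only: one past event at the origin at time 0.
past = [(0.0, 0.0, 0.0)]
soon_and_close = hawkes_intensity(1.0, 0.0, 0.0, past, 1e-6, 0.5, 126.0, 10.0)
an_hour_later = hawkes_intensity(60.0, 0.0, 0.0, past, 1e-6, 0.5, 126.0, 10.0)
print(soon_and_close > an_hour_later)
```

With short scales such as those quoted above, the triggered part of the intensity is concentrated within minutes and a few hundred meters of each event, which is the pattern the abstract contrasts with diffusion over hours, days, or weeks.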

Papers and replication code available at