Can You Say Crime-Prediction Software Is Bad without Considering Whether It Is Accurate?


Aaron Sankin, Dhruv Mehrotra, Surya Mattu, and Annie Gilbertson of The Markup and Gizmodo have a lengthy supposed exposé of PredPol, predictive software that police departments use to analyze crime patterns and decide where officers should be deployed. The tone is breathless:

Residents of neighborhoods where PredPol suggested few patrols tended to be Whiter and more middle- to upper-income. Many of these areas went years without a single crime prediction. By contrast, neighborhoods the software targeted for increased patrols were more likely to be home to Blacks, Latinos, and families that would qualify for the federal free and reduced lunch program. These communities weren’t just targeted more—in some cases they were targeted relentlessly. Crimes were predicted every day, sometimes multiple times a day, sometimes in multiple locations in the same neighborhood: thousands upon thousands of crime predictions over years. A few neighborhoods in our data were the subject of more than 11,000 predictions. The software often recommended daily patrols in and around public and subsidized housing, targeting the poorest of the poor.

The implication here is that PredPol looks at the race and economic status of neighborhoods instead of the frequency of crime in these neighborhoods, so it is advising police to patrol them more heavily for no good reason, or worse, for a very bad reason. Ibram X. Kendi, the patron saint of junk statistics on racial bias, was among those chiming in to promote the piece to his 413,000 Twitter followers.

There’s one huge, glaring problem: In a column that runs some 6,500 words and an accompanying methodology explainer twice that long, the authors never attempt to evaluate whether PredPol is accurately predicting patterns of crime. You have to read some 80 paragraphs into the story to learn this:

We did not try to determine how accurately PredPol predicted crime patterns. Its main promise is that officers responding to predictions prevent crimes by their presence. But several police departments have dropped PredPol’s software in recent years, saying they didn’t find it useful or couldn’t judge its effectiveness…“As time went on, we realized that PredPol was not the program that we thought it was when we had first started using it,” Tracy Police Department chief of staff Sgt. Craig Koostra said in a written statement. He did not respond to a request to elaborate.

As the explainer elaborates:

We were unable to investigate the “accuracy” of PredPol predictions—whether predicted crimes occurred on predicted days in predicted locations—nor do we know how each agency chose to respond to each prediction. As mentioned earlier, we asked every department to provide data about officer responses to PredPol predictions, which PredPol calls “dosage,” but only Plainfield and Portage provided any of that data. It is possible that some officers ignore PredPol reports entirely. Records for Plainfield showed officers responding to less than 2 percent of the total predictions that PredPol made for the department. How much of this is due to incomplete reporting by the department is impossible to know.

In other words, all the rhetoric about “bias” simply assumes the answer to the very question at issue: whether PredPol has a racial or class bias compared with a perfectly accurate prediction of crime. In fact, the authors admit that a study that attempted to reduce the disparities did so only by making the software less accurate, and that the company rejected the change because it would divert cops away from areas whose residents were victimized by high crime:

The study authors developed a potential tweak to the algorithm that they said resulted in a more even distribution of crime predictions. But they found its predictions were less in line with later crime reports, making it less accurate than the original algorithm, although still “potentially more accurate” than human predictions. [Brian MacDonald, CEO of PredPol] said the company didn’t adjust its software in response. “Such a change would reduce the protection provided to vulnerable neighborhoods with the highest victimization rates,” he said.
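For what it is worth, the accuracy question the explainer describes (whether predicted crimes occurred on predicted days in predicted locations) is not conceptually hard to pose. A minimal sketch in Python, using invented toy data and a hypothetical `hit_rate` helper that has nothing to do with PredPol's or The Markup's actual methods, might look like this:

```python
from datetime import date


def hit_rate(predictions, reported_crimes):
    """Share of (day, location) predictions matched by a reported crime.

    Both arguments are iterables of (date, location_id) pairs. This is an
    illustrative measure only; it is not PredPol's own accuracy metric.
    """
    predictions = list(predictions)
    if not predictions:
        return 0.0
    reported = set(reported_crimes)
    hits = sum(1 for p in predictions if p in reported)
    return hits / len(predictions)


# Hypothetical toy data: the cell names and dates are made up.
predictions = [
    (date(2020, 1, 1), "cell_A"),
    (date(2020, 1, 1), "cell_B"),
    (date(2020, 1, 2), "cell_A"),
    (date(2020, 1, 2), "cell_C"),
]
reported_crimes = [
    (date(2020, 1, 1), "cell_A"),
    (date(2020, 1, 2), "cell_C"),
    (date(2020, 1, 3), "cell_D"),
]

rate = hit_rate(predictions, reported_crimes)
print(f"hit rate: {rate:.2f}")
```

A number like this would still need careful reading: it counts only reported crime, and if officers responding to a prediction deter a crime, a successful patrol registers as a miss. But it is the kind of measurement the article never attempts.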

As is customary in this kind of writing, many thousands of words are devoted to the plight of particular people stopped, investigated, or arrested by police, but the authors cannot spare even a moment of empathy for the victims of crime. In fact, in their one effort to argue that crime statistics are biased, they put the blame on black and Hispanic crime victims for reporting their own victimization to the police:

“We use crime data as reported to the police by the victims themselves,” [MacDonald] said. “If your house is burglarized or your car stolen, you are likely to file a police report.” But that’s not always true, according to the federal Bureau of Justice Statistics (BJS). The agency found that only 40 percent of violent crimes and less than a third of property crimes were reported to police in 2020, which is in line with prior years. The agency has found repeatedly that White crime victims are less likely to report violent crime to police than Black or Latino victims. In a special report looking at five years of data, BJS found an income pattern as well. People earning $50,000 or more a year reported crimes to the police 12 percent less often than those earning $25,000 a year or less. This disparity in crime reporting would naturally be reflected in predictions. “There’s no such thing as crime data,” said Phillip Goff, co-founder of the nonprofit Center for Policing Equity, which focuses on bias in policing. “There is only reported crime data. And the difference between the two is huge.”

Well, yes, it undoubtedly is true that if you rob poorer people, they are likelier to feel the harm and less likely to just file an insurance claim. (Disparities in the reasons for reporting crime are also one reason why murder rates are the most reliable source of information on crime: The vast majority of murders are discovered or reported.) On the other hand, we hear an awful lot from law-enforcement critics about how non-white people mistrust the police, but the fact that they are more likely to seek the assistance of police would appear to show that they actually want the cops to fight crime in their neighborhoods.

For good measure, the authors cite an open letter from ten academics in the summer of 2020 — when a lot of things were written and said that will look increasingly ludicrous with the passage of time — “call[ing] on the mathematics community to boycott working with police departments.” “Given the structural racism and brutality in US policing,” the signatories argued, “we do not believe that mathematicians should be collaborating with police departments in this manner. It is simply too easy to create a ‘scientific’ veneer for racism.” The open letter did not make even the slightest effort to show that mathematicians were making policing worse or that the use of data was inaccurate or biased; it simply asserted that policing equals racism, and so counting the numbers must be racist, too.

Is PredPol actually a good program? I have no idea. Certainly nothing in the Markup article or its accompanying explainer actually showed that it is, or is not, an accurate predictor of crime, or that it was used by any police department in a way that made its policing better or worse. Good police work has relied on crime data as far back as 19th-century London; the New York City police department was producing crime maps as early as 1900. The policing revolution of the past 30 years, kicked off with the NYPD’s introduction of the Compstat program in 1995, has increasingly used data to drive decisions about where and how to deploy police resources to both catch and deter criminals. But as we know in any field, blindly relying on data has its limitations. It is always good to provide some checks on data with common sense and experience, and to continually evaluate how accurate one’s data tools are — including whether they simply replicate biases baked into the numbers that are fed into them. But hyperventilating about bias without bothering to consider the actual reality of crime and its victims is just shoddy yellow journalism.