Strata Conference, Santa Clara, CA, Feb 27, 2013
http://strataconf.com/strata2013/public/schedule/detail/27443

At Strata 2012 in New York, we discussed the hazards of curbing big data inferences by defining a new category of thoughtcrime. After all, acting on thoughts might constitute a crime, but thoughts, in isolation, cannot be criminal. It’s time to go deeper. Let’s create and evaluate a predictive criminal model that highlights where the sensitivities lie, both technically and ethically.

Over the last decade, Intelius has built a people-centric big data platform, which we call the inome platform. We’ll use it and our criminal database of several hundred million U.S. criminal records to train and evaluate a predictive criminal model. As part of this talk, we’ll release the model and some of the inome machine-learning scaffolding code.

What makes big data so scary is that, for the first time, we are leveraging huge data mines to make inferences outside the wisdom of our own minds. Is it possible to predict, with meaningful recall and acceptable precision, who might commit a crime? We’ll showcase our model’s shortcomings due to inescapable precision/recall trade-offs: false negatives miss criminals, while false positives indict the innocent. And even if we could build a perfect predictor, does a powerful government have the right to use it and eclipse free will?
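
To make the precision/recall trade-off concrete, here is a minimal Python sketch. It is not the inome code or the model from the talk. The labels and scores are invented toy values, and the helper names (`confusion_counts`, `precision_recall`) are hypothetical. It only shows how moving a decision threshold trades false negatives against false positives.

```python
def confusion_counts(labels, scores, threshold):
    """Count outcomes when every score >= threshold is flagged as positive."""
    tp = fp = fn = tn = 0
    for label, score in zip(labels, scores):
        flagged = score >= threshold
        if flagged and label:
            tp += 1  # correctly flagged
        elif flagged:
            fp += 1  # false positive: an innocent person is flagged
        elif label:
            fn += 1  # false negative: a true case is missed
        else:
            tn += 1  # correctly left alone
    return tp, fp, fn, tn


def precision_recall(labels, scores, threshold):
    """Return (precision, recall) at the given threshold."""
    tp, fp, fn, _ = confusion_counts(labels, scores, threshold)
    # Convention chosen for this sketch: precision is 1.0 when nothing is flagged.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Invented toy data for illustration only: 1 = positive label, 0 = negative label.
labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.95, 0.90, 0.70, 0.65, 0.55, 0.50, 0.40, 0.30, 0.20, 0.10]

for threshold in (0.8, 0.6, 0.45):
    p, r = precision_recall(labels, scores, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

On this toy data, lowering the threshold from 0.8 to 0.45 raises recall from 0.50 to 1.00 and drops precision from 1.00 to about 0.67. Every missed case that is recovered comes at the cost of flagging more people who should not be flagged.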