Modeling Verdict Outcomes Using Social Network Measures: The Watergate and Caviar Network Cases
Author
dc.contributor.author
Masías Hinojosa, Víctor
Author
dc.contributor.author
Valle, Mauricio
Author
dc.contributor.author
Morselli, Carlo
Author
dc.contributor.author
Crespo, Fernando
Author
dc.contributor.author
Vargas, Augusto
Author
dc.contributor.author
Laengle Scarlazetta, Sigifredo
Admission date
dc.date.accessioned
2016-06-13T15:56:22Z
Available date
dc.date.available
2016-06-13T15:56:22Z
Publication date
dc.date.issued
2016
Item citation
dc.identifier.citation
PLoS ONE 11 (1): e0147248 (2016)
Identifier
dc.identifier.other
DOI: 10.1371/journal.pone.0147248
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/138746
General note
dc.description
ISI publication article
Abstract
dc.description.abstract
Modelling criminal trial verdict outcomes using social network measures is an emerging research area in quantitative criminology. Few studies have yet analyzed which of these measures are the most important for verdict modelling or which data classification techniques perform best for this application. To compare the performance of different techniques in classifying members of a criminal network, this article applies three machine learning classifiers (Logistic Regression, Naive Bayes and Random Forest) with a range of social network measures and the necessary databases to model the verdicts in two real-world cases: the U.S. Watergate Conspiracy of the 1970s and the now-defunct Canada-based international drug trafficking ring known as the Caviar Network. In both cases, the Random Forest classifier outperformed both Logistic Regression and Naive Bayes, and its superior performance was statistically significant. Random Forest was therefore used not only for classification but also to assess the importance of the measures. For the Watergate case, the most important measure proved to be betweenness centrality, while for the Caviar Network it was the effective size of the network. These results are significant because they show that an approach combining machine learning with social network analysis can not only generate accurate classification models but also help quantify the importance of social network variables in modelling verdict outcomes. We conclude our analysis with a discussion and some suggestions for future work in verdict modelling using social network measures.
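The two measures the abstract singles out, betweenness centrality and effective size, can be illustrated on a small network. The sketch below is not the authors' code or data: the five-node graph and the function names are hypothetical. Betweenness is computed with Brandes' algorithm for an unweighted, undirected graph. Effective size uses Borgatti's simplified form n - 2t/n (n alters, t ties among them), which is an assumption about how the measure would be defined here.

```python
from collections import deque

# Hypothetical toy network (NOT data from the article).
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]


def build_adjacency(edges):
    """Undirected adjacency as a dict mapping node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def betweenness(adj):
    """Unnormalised betweenness centrality (Brandes' algorithm, unweighted)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}    # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}   # dependency of s on each node
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every pair is counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


def effective_size(adj):
    """Effective size of each ego network, assuming the form n - 2t/n."""
    result = {}
    for ego, alters in adj.items():
        n = len(alters)
        if n == 0:
            result[ego] = 0.0
            continue
        # t = number of ties among the ego's alters (each counted once).
        t = sum(1 for a in alters for b in alters if a < b and b in adj[a])
        result[ego] = n - 2 * t / n
    return result


adj = build_adjacency(EDGES)
bc = betweenness(adj)
es = effective_size(adj)
for node in sorted(adj):
    print(node, round(bc[node], 3), round(es[node], 3))
```

In a verdict-modelling setting such per-actor values would form the feature columns passed to a classifier. That step is omitted here because it depends on the case data and on the classifier settings.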