Twitter for marijuana infodemiology
Abstract
Online social networks are promising tools for quickly monitoring what is happening in a population, since they provide environments where users can freely share large amounts of information about their own lives. Given the well-known limitations of surveys, this novel kind of data can be used to obtain additional real-time insights into people's actual behavior related to drug use. The aim of this work is to use text messages (tweets) and relationships between Chilean Twitter users to predict marijuana use among them. To do this, we collected Twitter accounts using a location-based criterion and built a set of features based on the tweets they posted and on egocentric network metrics. To obtain tweet-based features, tweets were filtered using marijuana-related keywords, and a set of 1000 tweets was manually labeled to train algorithms capable of predicting marijuana use in tweets. In addition, a sentiment classifier for tweets was developed using the TASS corpus. We then conducted a survey to obtain real marijuana use labels for the accounts, and these labels were used to train supervised machine learning algorithms. The per-user marijuana use classifier achieved precision, recall and F-measure values close to 0.7, implying significant predictive power of the selected variables. We obtained a model capable of predicting marijuana use among Twitter users and estimating their opinion about marijuana. This information can be used as an efficient (fast and low-cost) tool for marijuana surveillance and to support decision making about drug policies.
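As a rough illustration of two steps mentioned in the abstract (keyword-based tweet filtering and evaluation by precision, recall and F-measure), a minimal Python sketch might look like the following. The keyword list, the function names and the toy labels are assumptions made for this example only; they are not the keywords, code or data used in the work itself.

```python
# Hypothetical keyword list: the actual keywords used in the study are not given here.
KEYWORDS = {"marihuana", "marijuana", "cannabis"}


def mentions_marijuana(tweet: str) -> bool:
    """Return True if any keyword appears as a whole word in the tweet."""
    # Lowercase each token and strip surrounding punctuation, hashtags and mentions.
    words = {w.strip(".,;:!?¡¿#@\"'()").lower() for w in tweet.split()}
    return not KEYWORDS.isdisjoint(words)


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F-measure for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy tweets and labels, invented for illustration.
    tweets = ["Legalicen la #marihuana ya!", "Buenos días Santiago"]
    print([mentions_marijuana(t) for t in tweets])
    print(precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

In this sketch the F-measure is the harmonic mean of precision and recall, which is the usual definition; the abstract does not state which variant was used.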
Indexation
SCOPUS publication article
Quote Item
WI ’17, August 23-26, 2017, Leipzig, Germany