Task Description




PSB 2016 Social Media Mining Shared Task Workshop


This workshop is a platform for teams to exercise their best NLP techniques on social media data, specifically for detecting and extracting mentions of adverse drug reactions. The workshop complements the Social Media Mining for Public Health Monitoring and Surveillance session.

Teams or individuals can participate in one or more of the proposed tasks, each posing distinct challenges.


Problem Background

Adverse drug reactions (ADRs), defined as accidental injuries resulting from correct medical drug use, present a serious and costly health problem, contributing to 5.3% of all hospital admissions each year [1]. The process of detection, assessment, understanding, and prevention of these events is called pharmacovigilance [2]. To facilitate pharmacovigilance efforts, governments worldwide run diverse surveillance programs. One example in the U.S. is MedWatch [3], which enables both patients and providers to manually submit ADR information. However, these programs are chronically underutilized. A systematic review encompassing 12 countries estimated an under-reporting rate of 85-94% [3] for ADRs in local, regional, and national reporting systems. To improve detection rates, researchers have begun turning to alternative sources of healthcare data, such as social media. Recent studies suggest that 26% of adult internet users discuss personal health issues online, with 42% of them discussing current conditions on social media and 30% reportedly changing their behavior as a result [4, 5]. Recent studies have focused on automatic classification of ADR-assertive user posts [6, 7, 8, 9] and on automatic extraction of ADR mentions from posts [10, 11, 12, 13]. However, prior to our recent pilot studies [8, 12], publicly available data was scarce, and a direct comparison of the approaches was not possible. Therefore, the release of a gold standard and the proposed task will foster advances on this topic.


Task Summary

The task is divided into three subtasks: (i) automatic classification of Adverse Drug Reaction (ADR) assertive user posts, (ii) automatic extraction of ADR mentions from user posts, and (iii) normalization of ADR mentions into UMLS (Unified Medical Language System) concept IDs. The task will take advantage of a large, expert-annotated data set from Twitter that has already been made publicly available. The task is designed to capitalize on the interest in social media mining and to appeal to a diverse set of researchers working on distinct topics such as natural language processing, biomedical informatics, and machine learning. The task presents a number of interesting challenges, including the noisy nature of the data, the informal language of the user posts, misspellings, and data imbalance.



Task 1: Binary Classification of ADRs

The first proposed sub-task focuses on automatic classification of ADR-assertive user posts. This task will utilize the binary annotations in the data. Participants will be provided with a training/development set containing the annotations. Evaluation will be performed on a blind set that will not be released before the evaluation deadline. Systems will be evaluated on their ability to automatically classify ADR-containing posts.



The training data consists of 7,574 instances (~70% of the original corpus) containing binary annotations. The evaluation set consists of 3,284 instances with an ADR-to-nonADR ratio similar to that of the training set. For each tweet, the publicly available data set contains: (i) the user ID, (ii) the tweet ID, and (iii) the binary annotation indicating the presence or absence of ADRs, as shown below. The evaluation data will contain the same information, but without the classes. Participating teams should submit their results in the same format as the training set (shown below).


User ID              Tweet ID     Class
349294537367236611   149749939    0
354256195432882177   54516759     0
352456944537178112   1267743056   1
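
As an illustration of the three-column format shown above, the following Python sketch reads such rows and writes them back out. It is not the official download script. The names Task1Row, parse_task1, and format_task1 are hypothetical, the field names simply follow the column header, and both the whitespace-separated input and the tab-separated output are assumptions about the file layout.

```python
from typing import List, NamedTuple


class Task1Row(NamedTuple):
    user_id: str   # first column, named after the header shown above
    tweet_id: str  # second column
    label: int     # 1 = ADR present, 0 = no ADR


def parse_task1(text: str) -> List[Task1Row]:
    """Parse whitespace-separated rows, skipping blank lines and the header."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Keep only data rows: exactly three fields, all numeric.
        if len(fields) != 3 or not all(f.isdigit() for f in fields):
            continue
        rows.append(Task1Row(fields[0], fields[1], int(fields[2])))
    return rows


def format_task1(rows: List[Task1Row]) -> str:
    """Write rows in the same three-column layout (tab-separated: an assumption)."""
    return "\n".join(f"{r.user_id}\t{r.tweet_id}\t{r.label}" for r in rows)


sample = """User ID Tweet ID Class
349294537367236611 149749939 0
354256195432882177 54516759 0
352456944537178112 1267743056 1
"""
rows = parse_task1(sample)
# Share of ADR-positive rows, useful for checking the class imbalance.
adr_share = sum(r.label for r in rows) / len(rows)
```

IDs are kept as strings so that long identifiers are never altered by numeric conversion.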


Details about the download script and the data are available at: task 1 data


Task 2: ADR Extraction

This sub-task is a Named Entity Recognition (NER) task, and the aim is to automatically extract the ADR mentions reported in user posts. This includes identifying the text span of the reported ADRs. Participants may use advanced machine learning systems to extract the mentions and correctly distinguish ADRs from similar non-ADR mentions.



The data for this sub-task includes more than 2,000 tweets that are fully annotated for mentions of ADRs and indications (reasons to use the drug). This set contains a subset of the tweets from sub-task 1 that were tagged as hasADR, plus a random set of 800 nonADR tweets. The nonADR subset was annotated for mentions of indications so that participants can develop techniques to deal with this confusion class. The annotations are stored in a text file that contains the following details for each annotation: tweet ID, start offset, end offset, semantic type (ADR/Indication), UMLS ID, annotated text span, and the related drug.
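
The annotation fields listed above could be read along the following lines. This is a minimal Python sketch under stated assumptions: one annotation per line, tab-separated fields in the order given, and an exclusive end offset. None of these details are confirmed by the task description. The names Annotation, parse_annotations, and span_matches are hypothetical, and the sample row and tweet text are invented for illustration.

```python
from typing import List, NamedTuple


class Annotation(NamedTuple):
    tweet_id: str
    start: int          # start offset of the span in the tweet text
    end: int            # end offset (assumed exclusive in this sketch)
    semantic_type: str  # "ADR" or "Indication"
    umls_id: str
    text: str           # annotated text span
    drug: str           # related drug


def parse_annotations(raw: str) -> List[Annotation]:
    """Parse tab-separated annotation lines (the delimiter is an assumption)."""
    annotations = []
    for line in raw.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) != 7:
            raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
        tweet_id, start, end, sem_type, umls_id, text, drug = fields
        annotations.append(
            Annotation(tweet_id, int(start), int(end), sem_type, umls_id, text, drug)
        )
    return annotations


def span_matches(tweet_text: str, ann: Annotation) -> bool:
    """Check that the offsets recover the annotated span from the tweet text."""
    return tweet_text[ann.start:ann.end] == ann.text


# Invented example row and tweet, not taken from the corpus.
raw = "1\t17\t24\tADR\tc0040822\tshaking\texampledrug\n"
tweet = "this drug has me shaking all day"
anns = parse_annotations(raw)
ok = span_matches(tweet, anns[0])
```

Running span_matches over the training data is a quick way to find out whether the real offsets are inclusive or exclusive before building an extraction system on top of them.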


Participating teams must submit their results on the test set in the same format as the training set.


The data is available at: task 2 data


Task 3: Normalization of ADR mentions

This is a concept normalization task. Given an ADR mention in natural language (colloquial or other), participant systems are required to identify the UMLS concept ID for the mention.



Training data will consist of a set of ADR mentions and their corresponding, human-assigned UMLS CUIs, as shown below. Submissions should follow an identical format.


Schizophrenia          c0036341
tension in my nerves   c0027769
shaking                c0040822


Systems will be evaluated based on the closeness of their predictions to the gold standard. A system prediction will be considered correct if the predicted CUI is identical to, is a synonym of, or has an is-a relationship to the gold standard concept.
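
The correctness rule above can be sketched in Python as follows. This is not the official scorer. The function names is_correct and accuracy are hypothetical, as are the synonyms and parents lookup tables, which a real system would have to derive from UMLS. Treating the is-a relationship as a direct link in either direction is also an assumption, since the description does not state direction or transitivity. The CUIs starting with "c9" are invented placeholders.

```python
from typing import Dict, List, Set, Tuple


def is_correct(predicted: str, gold: str,
               synonyms: Dict[str, Set[str]],
               parents: Dict[str, Set[str]]) -> bool:
    """Return True if the prediction is identical, a synonym, or linked by is-a."""
    predicted, gold = predicted.lower(), gold.lower()
    if predicted == gold:
        return True
    if predicted in synonyms.get(gold, set()) or gold in synonyms.get(predicted, set()):
        return True
    # Direct is-a link in either direction (an assumption, see lead-in).
    return gold in parents.get(predicted, set()) or predicted in parents.get(gold, set())


def accuracy(pairs: List[Tuple[str, str]],
             synonyms: Dict[str, Set[str]],
             parents: Dict[str, Set[str]]) -> float:
    """Fraction of (predicted, gold) pairs judged correct."""
    if not pairs:
        return 0.0
    hits = sum(is_correct(p, g, synonyms, parents) for p, g in pairs)
    return hits / len(pairs)


# Invented lookup tables; keys and values are lower-case CUIs.
synonyms = {"c0040822": {"c9000001"}}
parents = {"c9000002": {"c0036341"}}

pairs = [
    ("C0040822", "c0040822"),  # identical after case folding
    ("c9000001", "c0040822"),  # listed as a synonym
    ("c9000002", "c0036341"),  # direct is-a link
    ("c0027769", "c0036341"),  # unrelated concepts
]
score = accuracy(pairs, synonyms, parents)
```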


The data for this task can be found at: task 3 data



Specific evaluation details for each task will be posted here soon.




To register, send an email to Abeed Sarker (abeed.sarker@asu.edu) with the following information:

- Name of your team;
- Names of team members and their affiliations.

We will send you a confirmation message once the registration is completed.



May 15, 2015: release of training data

August 15, 2015: release of evaluation data

August 20, 2015: deadline for submissions

September 1, 2015: release of results and ranks

October 1, 2015: system descriptions due


Task Organizers

Dr. Graciela Gonzalez (ggonzal@asu.edu), Arizona State University

Dr. Abeed Sarker (abeed.sarker@asu.edu), Arizona State University

Azadeh Nikfarjam (anikfarj@asu.edu), Arizona State University 

Queries to: Dr. Abeed Sarker (abeed.sarker@asu.edu)




Please upload your file using the following link:


Submission page


Name your files as: TeamName_AssignedTeamNumber_TaskNumber


Example: DiegoLab_21_1
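
For teams that generate submissions automatically, the naming pattern above can be produced with a small helper like this Python sketch. The function name submission_name is hypothetical, and the check that the task number is 1, 2, or 3 is an assumption based on the three tasks described on this page. No file extension is added because none is specified.

```python
def submission_name(team_name: str, team_number: int, task_number: int) -> str:
    """Build a name of the form TeamName_AssignedTeamNumber_TaskNumber."""
    if not team_name:
        raise ValueError("team_name must not be empty")
    if task_number not in (1, 2, 3):
        raise ValueError("task_number must be 1, 2, or 3")
    return f"{team_name}_{team_number}_{task_number}"


name = submission_name("DiegoLab", 21, 1)
```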



[1] C. Kongkaew, P. R. Noyce, and D. M. Ashcroft, Hospital admissions associated with adverse drug reactions: a systematic review of prospective observational studies, Ann. Pharmacother., vol. 42, no. 7, pp. 1017-1025, 2008.

[2] World Health Organization. The importance of pharmacovigilance. World Health Organization, 2002.

[3] Office of the Commissioner, MedWatch: The FDA Safety Information and Adverse Event Reporting Program. [Online]. Available: http://www.fda.gov/Safety/MedWatch/default.htm. [Accessed: 28-Sep-2014].

[4] J. Parker, Y. Wei, A. Yates, O. Frieder, and N. Goharian, A framework for detecting public health trends with Twitter, in: Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2013, pp. 556-563.

[5] Twenty six percent of online adults discuss health information online; privacy cited as the biggest barrier to entry | Business Wire. [Online]. Available: http://www.businesswire.com/news/home/20121120005872/en/Twenty-percent-online-adultsdiscuss-healthinformation#.UvQ4M4WmWGQ. [Accessed: 07-Feb-2014].

[6] K. Jiang, Y. Zheng, Mining Twitter Data for Potential Drug Effects, Advanced Data Mining and Applications 8346 (2013) 434-443.

[7] J. Bian, U. Topaloglu, F. Yu, Towards large-scale twitter mining for drug-related adverse events, in: Proceedings of the 2012 international workshop on Smart health and wellbeing, 2012, pp. 25-32.

[8] R. Ginn, P. Pimpalkhute, A. Nikfarjam, A. Patki, K. O'Connor, A. Sarker, K. Smith, G. Gonzalez, Mining Twitter for Adverse Drug Reaction Mentions: A Corpus and Classification Benchmark, in: Proceedings of the Fourth Workshop on Building and Evaluating Resources for Health and Biomedical Text Processing, 2014.

[9] A. Patki, A. Sarker, P. Pimpalkhute, A. Nikfarjam, R. Ginn, K. O'Connor, K. Smith, G. Gonzalez, Mining Adverse Drug Reaction Signals from Social Media: Going Beyond Extraction, in: Proceedings of BioLinkSig 2014, 2014.

[10] R. Leaman, L. Wojtulewicz, R. Sullivan, A. Skariah, J. Yang, G. Gonzalez, Towards Internet-Age Pharmacovigilance: Extracting Adverse Drug Reactions from User Posts to Health-Related Social Networks, in: Proceedings of the 2010 Workshop on Biomedical Natural Language Processing, 2010, pp. 117-125.

[11] A. Nikfarjam, G. Gonzalez, Pattern Mining for Extraction of Mentions of Adverse Drug Reactions from User Comments, in: Proceedings of the American Medical Informatics Association (AMIA) Annual Symposium, 2011, pp. 1019-1026.

[12] K. O'Connor, A. Nikfarjam, R. Ginn, P. Pimpalkhute, A. Sarker, K. Smith, and G. Gonzalez, Pharmacovigilance on Twitter? Mining Tweets for Adverse Drug Reactions, in: American Medical Informatics Association (AMIA) Annual Symposium, 2014.

[13] A. Yates, N. Goharian, ADRTrace: detecting expected and unexpected adverse drug reactions from user reviews on social media sites, in: Proceedings of the 35th European conference on Advances in Information Retrieval, 2013, pp. 816-819.

DIEGO LAB 2015. Email: Competition Organisers.