Crowdsourcing and Mining Crowd Data


Call for Papers and Posters


Submission: August 4, 2014 (extended from July 31, 2014)

Pacific Symposium on Biocomputing

January 4-8, 2015

Fairmont Orchid Resort, Kohala Coast

The Big Island of Hawaii, USA



Modern biomedical datasets are frequently large and heterogeneous, contain a significant amount of information of unknown relevance, and/or are created for purposes other than the analysis being performed ('big data'). These trends have motivated increasing research into a form of large-scale collaboration known as crowdsourcing: the use of data from a broad population, often the general public. This session takes a wide view of crowdsourcing, defining the term to mean both the distribution of data analysis via an open call and the contribution of data.


Crowdsourcing approaches have diversified as the field has begun to define itself. Well-known techniques for data analysis include microtask environments such as Amazon's Mechanical Turk, where semi-anonymous workers are paid to perform discrete tasks; games with a purpose such as FoldIt; and collaborative content creation frameworks such as Wikipedia. The collaborative sourcing of data also has significant uses, including the collection of genomic data through sites such as 23andMe and experiments performed by online patient communities such as PatientsLikeMe. The advantages of such techniques include reduced cost, increased data sizes, environments closer to those in the real world, and increased public participation in science. Yet these techniques remain challenging for several reasons. One of the foremost is the heterogeneity of both the population responding to an open call and the resulting data, which necessitates robust analysis methods such as techniques for data filtering and aggregation.



This session focuses on biomedical discovery through crowd data. We hope to include a broad array of applications employing crowdsourcing techniques, spanning both text and data mining. Solicited topics include both methodological advances in learning from crowd data and applications of crowd data within biocomputing. Types of crowd data considered in this session include, but are not limited to:

  • Human genomics sequence data

  • 1000 genomes toxicity screening data

  • Wikipedia data

  • Electronic health record (EHR) data

  • Social media data (e.g. Twitter, blogs, reviews)

  • Query log data

  • Amazon Mechanical Turk

  • Citizen science data

  • Online computer gaming

  • Open innovation contests


Example bioinformatics applications include:

  • Annotating gene functions, pathways, etc.

  • Bio-data cataloging and curation

  • Health surveillance (flu, asthma, drug side effects, etc.)

  • Ontology development

  • Pharmacovigilance

  • Protein structure prediction

  • Toxicity testing

  • Validation of computational or wet-lab results


Key dates

Manuscript submission: August 4, 2014 (extended from July 31, 2014)

Notification to authors: September 10, 2014

Camera-ready papers due: October 1, 2014

Deadline for poster abstracts: November 17, 2014



Robert Leaman, National Center for Biotechnology Information (NCBI)

Benjamin Good, Scripps Research Institute

Andrew Su, Scripps Research Institute

Zhiyong Lu, National Center for Biotechnology Information (NCBI)



The scientific core of the conference consists of rigorously peer-reviewed full-length papers reporting on original work. Accepted papers will be published in an archival proceedings volume (fully indexed in PubMed), and a number of the papers will be selected for presentation during the conference. Researchers wishing to present their research without official publication are encouraged to submit a one-page abstract, and present their work in a poster session.


Please note that the submitted papers are reviewed and accepted on a competitive basis. At least three reviewers will be assigned to each submitted manuscript.


Paper Format

Please see the PSB paper format template and submission instructions at

Submitted papers are limited to twelve (12) pages in the PSB publication format.



Robert Leaman,