PSB Workshops

Pacific Symposium on Biocomputing

Big Island of Hawaii - January 4-8, 2025

PSB is offering four workshops during the meeting. These workshops were created to provide an opportunity for a gathering that will not be based on peer-reviewed papers included in the proceedings book. The workshops will consist of presentations by invited speakers. Abstract submissions for the workshops will be evaluated by the workshop co-chairs.

Each workshop has a chair who is responsible for organizing submissions. Please contact the specific workshop chair relevant to your interests for further information. Links on each of the titles below lead to more detailed calls for participation.

All Together Now: Data Work to Advance Privacy, Science, and Health in the Age of Synthetic Data

Organizers: Lindsay Fernandez-Rhodes, Jennifer K. Wagner

Description: Transparency alone is insufficient to bridge the gap between data practices and public understanding in an era of synthetic data and artificial intelligence. “Data work” is needed for scientific data to be deemed trustworthy, meaningful and actionable, not only by scientists but also by the individuals from whom those data were derived or to whom those data relate. This workshop is intended to nurture a discussion of these issues and to describe the interdisciplinary competencies researchers need to embed ethical, legal and social considerations into their research.

Contact: Lindsay Fernandez-Rhodes
Email: Fernandez-rhodes at psu dot edu


Command Line to PipeLine: Cross-Biobank Analyses with Nextflow

Organizers: Anurag Verma, Lindsay Guare, Katie Cardone, Christopher Carson, Zachary Rodriguez

Biobanks hold immense potential for genomic research, but fragmented data and incompatible tools slow progress. This workshop equips you with Nextflow, a powerful workflow language for streamlining bioinformatic analyses across biobanks. Write code in your preferred language, and Nextflow handles the complexities, ensuring consistent, reproducible results across different platforms. This interactive session is ideal for beginner-to-intermediate researchers who want to (1) leverage biobank data for genomic discoveries, (2) build portable and scalable analysis pipelines, (3) ensure reproducibility in their findings, and (4) gain hands-on experience through presentations, demonstrations, tutorials, and discussions with bioinformatics experts.

Contact: Anurag Verma
Email: anurag.verma at pennmedicine dot upenn dot edu


Leveraging Foundational Models in Computational Biology: Validation, Understanding, and Innovation

Organizers: Steven Brenner, Brett Beaulieu-Jones

Large Language Models (LLMs) have demonstrated immense potential within and outside of the biomedical domain but currently have substantial limitations when applied to biomedical research. These models promise a new paradigm for data analysis, interpretation and hypothesis generation, but it is not clear how fully this promise will be fulfilled. LLMs are just one class of foundational models, and while they have already made a significant impact on computational biology, it is unlikely that a single architecture geared toward processing natural language will be the ideal framework for general learning in computational biology. This workshop aims to provide an understanding of the state of the art today and of current challenges in the application or development of models tailored to computational biology, and to start a discussion of what the future holds for our community.

Contact: Brett Beaulieu-Jones
Email: beaulieujones at uchicago dot edu


Opportunities and Pitfalls with Large Language Models for Biomedical Annotation

Organizers: Fabio Rinaldi, Jin-Dong Kim, Zhiyong Lu, Cecilia Arighi

LLMs and biomedical annotations have a symbiotic relationship. LLMs rely on high-quality annotations for training and improvement, while they can also automate parts of the annotation process and improve its quality.

High-quality, well-annotated biomedical data is crucial for training LLMs to understand and process scientific information. These annotations can include labeling entities (genes, proteins), relations (interactions), and other relevant information. By incorporating annotated data, LLMs can learn specific domain knowledge and improve their accuracy in tasks like information extraction, knowledge base creation, and text summarization. Diverse and unbiased annotations can help mitigate bias in LLMs, ensuring their outputs are fair and representative of the underlying data.

LLMs can be used to automate some aspects of annotation, such as identifying potential entities or suggesting relevant relations. This can significantly reduce the workload for human annotators. LLMs can identify areas of uncertainty in the data and suggest which annotations would be most valuable for improving their performance. This creates a feedback loop where LLMs guide the annotation process for optimal results. Finally, LLMs can be used to check the consistency and accuracy of annotations, identifying potential errors or inconsistencies.

By addressing these challenges, this workshop aims to clarify the potential and limits of LLMs in advancing biomedical research and knowledge discovery.

Contact: Fabio Rinaldi
Email: fabio dot rinaldi at idsia dot ch