BANNER: An Executable Survey of Advances in Biomedical Named Entity Recognition

Robert Leaman 1, Graciela Gonzalez 2


1 Department of Computer Science and Engineering, Arizona State University, 2 Department of Biomedical Informatics, Arizona State University

Pac Symp Biocomput. 2008;:652-663.


Abstract

There has been an increasing amount of research on biomedical named entity recognition, the most basic text extraction problem, resulting in significant progress by different research teams around the world. This has created a need for a freely available, open-source system implementing the advances described in the literature. In this paper we present BANNER, an open-source, executable survey of advances in biomedical named entity recognition, intended to serve as a benchmark for the field. BANNER is implemented in Java as a machine-learning system based on conditional random fields and includes a wide survey of the best techniques recently described in the literature. It is designed to maximize domain independence by not employing brittle semantic features or rule-based processing steps, and it achieves significantly better performance than existing baseline systems. It is therefore useful to developers as an extensible NER implementation, to researchers as a standard for comparing innovative techniques, and to biologists requiring the ability to find novel entities in large amounts of text. BANNER is available for download at http://banner.sourceforge.net.

