PSB 2003 Tutorial
The information age has made it easy to store large amounts of data. The
proliferation of biomedical literature available on the Web, on
corporate intranets, on news wires, and elsewhere is overwhelming.
However, while the amount of data available to us is constantly
increasing, our ability to absorb and process this information remains
constant. Search engines only exacerbate the problem by making more and
more documents available with just a few keystrokes.
Link analysis is a new and exciting research area that tries to solve
the information overload problem by combining techniques from data
mining, machine learning, information extraction, text categorization,
visualization, and knowledge management. Link analysis is the process of
building up networks of interconnected objects through various
relationships in order to discover patterns and trends. The main tasks
of link analysis are to extract, discover, and link together sparse
evidence from a vast number of data sources, to represent and evaluate
the significance of the related evidence, and to learn patterns to guide
the extraction, discovery, and linkage of entities. The discovered
relationships could be transactional, geographical, social, or temporal.
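
As a rough illustration of building a network of interconnected objects,
the following minimal sketch links entities that are mentioned together
in the same record and counts how often each link occurs. The function
names (build_link_graph, neighbors) and the toy records are hypothetical
and chosen only for illustration; they are not taken from any system
presented in the tutorial.

```python
from collections import defaultdict
from itertools import combinations


def build_link_graph(records):
    """Return {(entity_a, entity_b): weight} for co-mentioned entities.

    Each record is an iterable of entity names. Two entities are linked
    when they appear in the same record; the weight is the number of
    records in which the pair co-occurs.
    """
    weights = defaultdict(int)
    for entities in records:
        # Sort so that each undirected pair has one canonical key.
        for a, b in combinations(sorted(set(entities)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def neighbors(graph, entity):
    """Return {linked_entity: weight} for every link touching `entity`."""
    result = {}
    for (a, b), weight in graph.items():
        if a == entity:
            result[b] = weight
        elif b == entity:
            result[a] = weight
    return result


# Hypothetical toy records, e.g. entities extracted from three abstracts.
records = [
    ["BRCA1", "breast cancer", "TP53"],
    ["BRCA1", "breast cancer"],
    ["TP53", "apoptosis"],
]

graph = build_link_graph(records)
tp53_links = neighbors(graph, "TP53")
```

Here the link weight stands in for the "significance of the related
evidence"; a real system would likely use a more careful measure than a
raw co-occurrence count.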
Link Analysis for BioInformatics involves preprocessing biomedical
document collections (using text categorization, term extraction, and
information extraction), integrating them with structured information
sources, storing the intermediate representations, analyzing these
representations (distribution analysis, clustering, trend analysis,
association rules, etc.), and visualizing the results. In this tutorial
we will present the general theory of Link Analysis for BioInformatics,
survey recent work, and demonstrate several systems that use these
principles to enable interactive exploration of combined structured and
unstructured collections.
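
To make one of the analysis techniques named above more concrete, here
is a minimal sketch of association rule discovery over the term sets
produced by preprocessing. It assumes each document has already been
reduced to a set of extracted terms, and it considers only
single-term rules of the form "a -> b". The function name, the
thresholds, and the toy documents are hypothetical.

```python
from itertools import permutations


def association_rules(docs, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) tuples.

    support    = fraction of documents containing both terms
    confidence = fraction of documents containing the antecedent
                 that also contain the consequent
    """
    doc_sets = [set(d) for d in docs]
    n = len(doc_sets)
    terms = sorted(set().union(*doc_sets))
    rules = []
    for a, b in permutations(terms, 2):
        count_a = sum(1 for d in doc_sets if a in d)
        count_ab = sum(1 for d in doc_sets if a in d and b in d)
        support = count_ab / n
        confidence = count_ab / count_a
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


# Hypothetical term sets, one per preprocessed document.
docs = [
    {"kinase", "phosphorylation", "cancer"},
    {"kinase", "phosphorylation"},
    {"kinase", "cancer"},
    {"apoptosis", "cancer"},
]

strong_rules = association_rules(docs, min_support=0.5, min_confidence=0.9)
all_rules = association_rules(docs, min_support=0.5, min_confidence=0.6)
```

With these toy documents, only "phosphorylation -> kinase" passes the
stricter confidence threshold, because every document that mentions
phosphorylation also mentions kinase. This brute-force pairwise scan is
meant only to show the idea; larger collections would call for an
algorithm designed for frequent itemset mining.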
We will present a general architecture of link analysis systems and
outline the algorithms and data structures behind the systems. The
tutorial will cover the state of the art in this rapidly growing area of
research. Several real-world applications of link analysis will be
presented.
Ronen Feldman is a senior lecturer at the Mathematics and Computer
Science Department of Bar-Ilan University in Israel, and the Director of
the Data Mining Laboratory. He received his B.Sc. in Math, Physics and
Computer Science from the Hebrew University, his M.Sc. in Computer
Science from Bar-Ilan University, and his Ph.D. in Computer Science from
Cornell University in NY. He is the founder and president of ClearForest
Corporation, a New York-based company specializing in the development of
text mining tools and applications. He is also an Adjunct Professor at NYU
Stern Business School.
Hagit Shatkay is an informatics research scientist in the Informatics
Research group at Celera. She received her B.Sc. and M.Sc. in Computer
Science from the Hebrew University, and her Ph.D. in Computer Science
from Brown University. Her research is in the area of machine learning
and probabilistic models. Her work during the last few years has
concentrated on using the literature for biological data analysis.
Contact Information:
Ronen Feldman, Ph.D.
ClearForest Corporation
15 East 26th Street, Suite 1711
New York, NY, 10010
ronen@clearforest.com
Tel: 212-432-1515 (x203)
Fax: 212-432-1929