In the scenic setting of Lake Barkley State Park, Kentucky, a diverse group of scientists gathered to shape the future of biological research.
Imagine trying to understand a complex machine by studying just one or two of its components. For decades, this was the challenge biologists faced. However, a revolution was underway in 2008, as scientists began leveraging computational power to analyze living systems in their entirety.
The Seventh Annual UT-ORNL-KBRIN Bioinformatics Summit, held March 28-30, 2008, served as a critical platform where this revolution was unfolding 1 .
This summit brought together 174 researchers, faculty, and students from institutions across Tennessee and Kentucky, creating a collaborative environment where multidisciplinary approaches to biological problems were not just encouraged, but essential 1 . They convened at a pivotal moment, grappling with a fundamental shift in how we understand life's blueprint: the very concept of a gene was being redefined in light of new data 1 . The work presented there laid foundational stones for the predictive, personalized medicine we are striving to achieve today.
Before diving into the summit's specifics, it's helpful to understand what bioinformatics is. Simply put, it is an interdisciplinary field that combines biology, computer science, mathematics, and statistics to analyze and interpret biological data 2 . In the era of high-throughput technologies, where scientists can generate enormous volumes of genetic data, bioinformatics provides the essential tools and frameworks to manage, process, and extract meaning from this information deluge 4 .
Bioinformatics involves "the collection, comprehension, manipulation, classification, storage, extraction, animation and usage of all biological information with the use of computer technology" 2 .
Its applications are vast, spanning from gene therapy and drug discovery to understanding evolutionary relationships and improving agriculture 2 .
The presentations at the summit reflected a field rapidly maturing, moving from simple data collection to sophisticated analysis and prediction. The overarching goal was clear: to transition biology from a descriptive science to a predictive one. Researchers were no longer content with just identifying genes; they sought to understand their complex interactions and regulatory networks to forecast biological behavior 1 .
A key theme was the focus on systems biology—the idea that biological systems must be understood as a whole, rather than as a collection of isolated parts 1 . This holistic approach was evident across the three main plenary sessions: Pathways to Prediction, Biomedical Informatics, and Regulatory Analysis.
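The field itself rests on three disciplines: biology, for understanding biological systems and processes at the molecular level; computer science, for developing algorithms and software for biological data analysis; and statistics, for applying statistical models to interpret complex biological data.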
The session led by Nitin Baliga of the Institute for Systems Biology focused on creating predictive models for how cells function. His work investigated how extremophiles—organisms that thrive in harsh conditions—manage to survive. Using systems biology approaches that incorporated data on transcription, translation, and molecular interactions, his team aimed to derive dynamic temporal relationships that would allow them to predict cellular responses to environmental stresses 1 .
This was not merely academic. Understanding these robust biological circuits holds the key to engineering organisms for biotechnology and grasping the fundamental principles of cellular control.
Dan Masys of Vanderbilt University highlighted the growing role of informatics in clinical research. He presented the Vanderbilt Institute for Clinical and Translational Research (VICTR) and its research portal, StarBRITE 1 . This platform integrated tools for patient recruitment, data management, and electronic medical record (EMR) support, showcasing a real-world application of bioinformatics to accelerate the translation of scientific discoveries into patient treatments.
A particularly compelling talk in this session came from Mikael Benson, who discussed the hunt for epigenetic markers for personalized medicine, using hay fever as a model 1 . His approach involved decomposing transcriptional networks from DNA microarrays to identify genes that could later be analyzed for polymorphisms, potentially leading to new diagnostic markers for allergies 1 .
Ziv Bar-Joseph of Carnegie Mellon University addressed one of the most complex challenges in genomics: understanding dynamic gene regulatory networks. He pointed out limitations in analyzing time-series gene expression data and presented innovative computational solutions like STEM and DREM 1 . These tools allowed scientists to align gene expression patterns over time and model the bifurcating paths of regulatory events, providing a much clearer picture of how genes are switched on and off in processes ranging from yeast cell cycles to bacterial responses 1 .
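The idea of grouping genes by the shape of their expression over time can be illustrated with a minimal sketch. This is not the STEM or DREM implementation; it is a simplified stand-in that assigns each gene to whichever template time course it correlates with best. The template shapes, gene names, and expression values are all made up for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Hypothetical template shapes over five time points
templates = {
    "steady rise": [0, 1, 2, 3, 4],
    "steady fall": [0, -1, -2, -3, -4],
    "transient peak": [0, 2, 3, 2, 0],
}

# Made-up log2 expression changes relative to the first time point
genes = {
    "gene_1": [0.0, 0.8, 2.1, 2.9, 4.2],
    "gene_2": [0.0, -0.5, -1.9, -3.2, -3.8],
    "gene_3": [0.0, 1.7, 3.1, 1.9, 0.2],
}

def assign_profiles(genes, templates):
    """Map each gene to the template whose shape correlates best with it."""
    assignment = {}
    for gene, series in genes.items():
        best = max(templates, key=lambda name: pearson(series, templates[name]))
        assignment[gene] = best
    return assignment

print(assign_profiles(genes, templates))
```

Correlation compares the shape of a time course rather than its magnitude, which is why a gene with a large rise and one with a small rise land in the same group. A real analysis would also need a significance test to decide which groups contain more genes than expected by chance.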
Microarray technology was a workhorse of genomics in 2008, allowing researchers to measure the expression levels of thousands of genes simultaneously. A typical experiment presented at the summit might have followed this methodology to identify genes involved in a specific disease.
The following steps outline a generalized microarray study procedure, reflective of the approaches discussed at the summit:
1. Researchers collect tissue samples from two groups, for instance healthy cells and cancerous cells.
2. Messenger RNA (mRNA) is isolated from the samples. This mRNA represents the genes that are actively being expressed in the cells.
3. The mRNA is reverse-transcribed into complementary DNA (cDNA) and tagged with fluorescent dyes (e.g., red dye for cancer cells, green dye for healthy cells).
4. The labeled cDNA samples are mixed and applied to a microarray slide. The cDNA strands bind to their complementary DNA probes attached to the slide.
5. A laser scanner excites the fluorescent dyes, and the resulting light intensity is measured. The color and brightness at each spot on the array indicate whether a gene is over-expressed (red), under-expressed (green), or unchanged (yellow) in the diseased sample compared to the healthy one.
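The scanning step can be sketched in a few lines: each spot yields a red and a green intensity, and the log2 of their ratio says which sample expressed the gene more strongly. The function name, the intensities, and the two-fold cutoff below are illustrative assumptions, not values from the summit; real pipelines also apply background correction and normalization before this point.

```python
import math

def classify_spot(red, green, threshold=1.0):
    """Classify one array spot from its two channel intensities.

    red       -- intensity of the diseased-sample channel
    green     -- intensity of the healthy-sample channel
    threshold -- hypothetical cutoff on |log2 ratio|; 1.0 means a two-fold change
    """
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    log_ratio = math.log2(red / green)
    if log_ratio >= threshold:
        return log_ratio, "over-expressed"
    if log_ratio <= -threshold:
        return log_ratio, "under-expressed"
    return log_ratio, "unchanged"

# Made-up intensities for three spots: (red, green)
spots = {
    "spot_1": (8000.0, 1000.0),
    "spot_2": (500.0, 4000.0),
    "spot_3": (1500.0, 1400.0),
}

for name, (red, green) in spots.items():
    log_ratio, call = classify_spot(red, green)
    print(f"{name}: log2(R/G) = {log_ratio:+.2f} -> {call}")
```

Working on the log2 scale makes the result symmetric: an eight-fold increase gives +3 and an eight-fold decrease gives -3, whereas the raw ratios would be 8 and 0.125.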
After processing the raw data, researchers would identify a list of genes that are differentially expressed. The power of bioinformatics, however, lies in going beyond a simple list.
| Gene ID | Gene Name | Function | Expression Change (Fold) | Statistical Significance (p-value) |
|---|---|---|---|---|
| Gene A | TP53 | Tumor suppressor | +5.2 | p < 0.001 |
| Gene B | MYC | Regulates cell growth | +3.8 | p < 0.005 |
| Gene C | BRCA1 | DNA repair | -4.5 | p < 0.001 |
| Gene D | EGFR | Cell division signal | +6.1 | p < 0.001 |
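As a rough sketch of how such a table might be filtered, the snippet below converts the signed fold changes to the log2 scale and keeps genes that pass two cutoffs. It assumes the sign encodes direction (+5.2 meaning 5.2 times higher, -4.5 meaning 4.5 times lower) and treats each reported p-value as an upper bound; both cutoffs are hypothetical defaults rather than anything prescribed by the source.

```python
import math

# Rows copied from the illustrative table:
# (gene, signed fold change, upper bound on the p-value)
rows = [
    ("TP53", 5.2, 0.001),
    ("MYC", 3.8, 0.005),
    ("BRCA1", -4.5, 0.001),
    ("EGFR", 6.1, 0.001),
]

def signed_fold_to_log2(fold):
    """Convert a signed fold change to a log2 fold change with the same sign."""
    if abs(fold) < 1:
        raise ValueError("signed fold changes have magnitude >= 1")
    return math.copysign(math.log2(abs(fold)), fold)

def select_genes(rows, min_abs_log2=1.0, max_p=0.01):
    """Split genes passing both (hypothetical) cutoffs into up and down lists."""
    up, down = [], []
    for gene, fold, p_bound in rows:
        log2_fc = signed_fold_to_log2(fold)
        if abs(log2_fc) < min_abs_log2 or p_bound > max_p:
            continue  # change too small or evidence too weak
        (up if log2_fc > 0 else down).append((gene, round(log2_fc, 2)))
    return up, down

up, down = select_genes(rows)
print("up-regulated:", up)
print("down-regulated:", down)
```

With thousands of genes tested at once, a real study would also correct the p-values for multiple testing before applying a cutoff.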
The real scientific importance emerges from the biological interpretation. For example, the simultaneous up-regulation of EGFR and MYC alongside the down-regulation of BRCA1 might suggest a coordinated cellular program driving rapid, potentially error-prone cell division. Researchers would use pathway analysis tools to see if these genes belong to a common biological pathway, such as cell cycle regulation or apoptosis (programmed cell death), thereby moving from a list of genes to a functional understanding of the disease mechanism 3 .
| Affected Pathway | Number of Genes Altered | Biological Process | Potential Therapeutic Implication |
|---|---|---|---|
| Cell Cycle Regulation | 12 | Control of cellular division | Target for stopping uncontrolled growth |
| Apoptosis | 8 | Programmed cell death | Restore cell death in cancer cells |
| DNA Repair | 5 | Maintenance of genomic integrity | Sensitivity to certain DNA-damaging drugs |
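One common way to judge whether a pathway count like those above is more than coincidence is a hypergeometric test, sketched here with the standard library only (`math.comb` requires Python 3.8 or later). The source does not say which pathway analysis method was used, so this is an assumed technique. Only the overlap of 12 comes from the table; the array size, pathway size, and number of selected genes are invented for the example.

```python
from math import comb

def enrichment_p_value(total_genes, pathway_genes, selected_genes, overlap):
    """Chance of seeing at least `overlap` pathway genes among the selected
    genes if the selection were random (hypergeometric upper tail)."""
    max_overlap = min(pathway_genes, selected_genes)
    total_ways = comb(total_genes, selected_genes)
    tail = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, selected_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total_ways

# Hypothetical setup: 10,000 genes on the array, 100 annotated to cell cycle
# regulation, 200 genes called differentially expressed, 12 of them in the pathway.
p = enrichment_p_value(total_genes=10_000, pathway_genes=100,
                       selected_genes=200, overlap=12)
expected = 200 * 100 / 10_000  # overlap expected by chance alone
print(f"expected overlap by chance: {expected:.1f}, observed: 12, p = {p:.2e}")
```

Under these invented numbers only about two pathway genes would be expected by chance, so an overlap of twelve would point to genuine enrichment. The conclusion depends entirely on the background counts, which is why they must come from the actual array and annotation database.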
The experiments discussed at the summit relied on a combination of biological materials, cutting-edge technology, and sophisticated software.
| Tool/Resource | Category | Function & Explanation |
|---|---|---|
| Microarray Chips | Laboratory Technology | Glass slides with thousands of microscopic DNA spots used to measure gene expression levels across the genome. |
| High-Throughput Sequencers | Laboratory Technology | Instruments that automate the process of determining the order of nucleotides in DNA, generating massive datasets. |
| Reference Genomes | Database | A curated, high-quality digital DNA sequence of an organism, serving as a standard for comparison in genomic studies. |
| Gene Ontology (GO) Database | Database | A standardized vocabulary that describes gene functions and attributes, allowing for consistent analysis across species. |
| STEM & DREM | Software Algorithm | Tools for analyzing time-series gene expression data, identifying significant patterns and dynamic regulatory events 1 . |
| REDCap | Software Framework | Research Electronic Data Capture, a secure web application for building and managing surveys and databases in clinical research 1 . |
| Supercomputing Resources (e.g., ORNL's Jaguar) | Computational | High-performance computing systems essential for processing the enormous computational load of bioinformatics analyses 1 . |
Broadly, these resources fall into three groups: laboratory technologies that generate biological data, including microarrays and sequencers; databases that collect biological information for reference and comparison; and software, the computational tools for analyzing and interpreting biological data.
The 2008 UT-ORNL-KBRIN Bioinformatics Summit was more than just an academic conference. It was a testament to the power of collaborative, interdisciplinary science.
The discussions there—from the fundamental redefinition of a gene to the application of supercomputing for analyzing molecular machines—have had a lasting impact 1 .
The summit concluded with a forward-looking vision, planning for future meetings to focus on translational informatics, epigenetics, and new technological advances 1 . The pioneering work shared in 2008 underscored a central theme that remains true today: the path to understanding the complex tapestry of life is through the seamless integration of biological inquiry and computational innovation. The researchers gathering by that Kentucky lake were not just sharing data; they were helping to build the lens through which we now view the inner workings of biology.
The legacy of the 2008 summit continues to influence bioinformatics research today, with many of the approaches and tools discussed having evolved into standard practices in modern genomic research.