How EU BON Created a Universal Library for Europe's Nature
Imagine trying to understand a complex novel by reading only every tenth page. This is the challenge scientists have faced for decades when trying to protect Europe's biodiversity.
Critical information about species and ecosystems has been scattered across hundreds of institutions, locked in filing cabinets, incompatible digital formats, and isolated databases. Some records exist as handwritten notes from decades ago, while others reside in modern DNA sequencing databases, but they rarely speak to one another.
Enter EU BON (European Biodiversity Observation Network), an ambitious project that set out to solve this problem. In their Deliverable 1.3, they tackled a particularly challenging aspect: how to mobilize and integrate collection-based data—both physical specimens and DNA information—into a unified system that could power both science and policy. Their solution didn't just create another database; it built bridges between islands of information, creating what many now call a "universal library for Europe's nature."
In short, information was scattered across hundreds of institutions in incompatible formats, specimen records and DNA data existed in separate silos, and EU BON set out to build bridges between these isolated information sources.
At its core, EU BON's approach recognizes that one size doesn't fit all in data integration. Just as you might use different strategies to organize a library of books, magazines, and digital media, EU BON employed multiple techniques to handle biodiversity's diverse data types.
The project implemented a sophisticated ETL (Extract, Transform, Load) pipeline 1 5. This process involves:
Extract: pulling data from its original sources, which could be anything from museum collection databases to modern DNA sequencing platforms
Transform: converting it into standardized formats that follow common rules and vocabularies
Load: placing it into accessible systems where it can be discovered and used by researchers and policymakers
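The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not EU BON's actual pipeline: the CSV export, its column names, and the `FIELD_MAP` mapping are invented, and the Darwin Core-style target field names are an assumption about what a "standardized format" might look like.

```python
import csv
import io

# Hypothetical raw export from a museum database (column names are invented).
RAW_CSV = """Species,Lat,Long,Collected
Gentiana acaulis ,46.5,10.1,1893-07-14
gentiana acaulis,not recorded,10.3,1950-06-02
"""

# Hypothetical mapping from source columns to Darwin Core-style field names.
FIELD_MAP = {
    "Species": "scientificName",
    "Lat": "decimalLatitude",
    "Long": "decimalLongitude",
    "Collected": "eventDate",
}


def extract(text):
    """Extract: read rows from the source export."""
    return list(csv.DictReader(io.StringIO(text)))


def to_float(value):
    """Return a float, or None when the source value is not numeric."""
    try:
        return float(value)
    except ValueError:
        return None


def transform(rows):
    """Transform: rename fields, tidy names, and parse coordinates."""
    out = []
    for row in rows:
        rec = {FIELD_MAP[key]: value.strip() for key, value in row.items()}
        rec["scientificName"] = rec["scientificName"].capitalize()
        rec["decimalLatitude"] = to_float(rec["decimalLatitude"])
        rec["decimalLongitude"] = to_float(rec["decimalLongitude"])
        out.append(rec)
    return out


def load(records, store):
    """Load: append standardized records to the target store."""
    store.extend(records)
    return len(records)


store = []  # stands in for the central repository
loaded = load(transform(extract(RAW_CSV)), store)
```

After the run, both rows carry the same cleaned name and the unparseable latitude is stored as `None` rather than silently dropped, which keeps the gap visible to later quality checks.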
For DNA-based data, this presented special challenges. Genetic information often comes in specialized formats and requires specific metadata about sequencing methods and analysis techniques to be truly useful. EU BON developed approaches to make this DNA data interoperable with traditional specimen records, creating a comprehensive picture that connects physical specimens with their genetic blueprints 2.
Perhaps the most innovative aspect of the system is its use of data virtualization 5. Instead of forcing every institution to upload all their data to a central repository (which would be impractical and resource-intensive), EU BON created a virtual layer that allows users to access and query data from multiple sources as if they were in a single location.
Think of it like a universal search engine for biodiversity data—the information remains with its original custodians, but scientists can find and analyze it seamlessly. This approach respects the ownership and maintenance practices of data providers while dramatically increasing the accessibility of information for the research community.
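The idea of a virtual access layer can be illustrated with a small Python sketch. It assumes each provider exposes some kind of search call; the `InMemorySource` and `VirtualLayer` classes and the sample records are hypothetical stand-ins, not components of the EU BON portal.

```python
class InMemorySource:
    """Stand-in for a remote provider; a real adapter would call its API."""

    def __init__(self, name, records):
        self.name = name
        self._records = records  # the data stays with its custodian

    def search(self, species):
        return [r for r in self._records if r["scientificName"] == species]


class VirtualLayer:
    """Fans one query out to every source and merges the answers."""

    def __init__(self, sources):
        self.sources = sources

    def query(self, species):
        results = []
        for source in self.sources:
            for record in source.search(species):
                # Tag each hit with its provider so provenance is preserved.
                results.append({**record, "provider": source.name})
        return results


museum = InMemorySource("museum", [
    {"scientificName": "Rana temporaria", "year": 1902},
    {"scientificName": "Salamandra atra", "year": 1911},
])
sequences = InMemorySource("sequence-db", [
    {"scientificName": "Rana temporaria", "marker": "COI"},
])

layer = VirtualLayer([museum, sequences])
hits = layer.query("Rana temporaria")
```

Nothing is copied into a central store here: the layer only forwards the query and labels each result with where it came from.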
| Integration Method | How It Works | Application in EU BON |
|---|---|---|
| ETL (Extract, Transform, Load) | Extracts data from sources, transforms to standard format, loads into central repository | Processing specimen records from museum collections for the central portal |
| Data Virtualization | Creates virtual access layer without moving original data; enables unified queries across sources | Allowing researchers to query distributed collections across Europe simultaneously |
| Middleware Integration | Uses specialized software as a bridge between different computer systems | Connecting modern databases with legacy systems in museum collections |
| API-Based Integration | Connects systems through programming interfaces for data exchange | Linking DNA sequence databases with specimen collection databases |
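As a rough illustration of the last row, linking sequence records to specimen records usually comes down to matching on a shared identifier. The sketch below assumes a voucher string of the form `institutionCode:catalogNumber`; that convention, the institution code, and the accession numbers are all invented for the example.

```python
specimens = [
    {"institutionCode": "NHM-X", "catalogNumber": "B-1001",
     "scientificName": "Parnassius apollo"},
    {"institutionCode": "NHM-X", "catalogNumber": "B-1002",
     "scientificName": "Parnassius apollo"},
]
sequences = [
    {"accession": "SEQ0001", "voucher": "NHM-X:B-1001", "marker": "COI"},
    {"accession": "SEQ0002", "voucher": "UNKNOWN:77", "marker": "COI"},
]


def voucher_key(specimen):
    """Build the (assumed) voucher identifier for a specimen record."""
    return f'{specimen["institutionCode"]}:{specimen["catalogNumber"]}'


def link(specimens, sequences):
    """Join sequences to specimens; return (linked, unlinked) lists."""
    index = {voucher_key(s): s for s in specimens}
    linked, unlinked = [], []
    for seq in sequences:
        specimen = index.get(seq["voucher"])
        if specimen is None:
            unlinked.append(seq)  # sequence with no matching specimen
        else:
            linked.append({**specimen, **seq})
    return linked, unlinked


linked, unlinked = link(specimens, sequences)
```

Keeping the unlinked sequences in their own list, rather than discarding them, makes it possible to report which genetic records still lack a physical voucher.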
Pipeline overview: Extract (from multiple sources), Transform (standardize and clean), Virtualize (create access layer), Query (unified access).
How do you test whether a complex data integration system actually works? EU BON scientists designed a series of real-world validation experiments across multiple test sites in Europe 4. One particularly revealing experiment focused on creating comprehensive profiles of selected species groups, combining historical distribution records from natural history collections with modern DNA-based observations.
The methodology followed these key steps:
Researchers selected multiple test cases involving species with both substantial specimen records in museum collections and available DNA sequence data in genetic databases.
They used the EU BON portal to query across all connected data sources simultaneously, from the Global Biodiversity Information Facility (GBIF) to specialized DNA databases 4.
The system identified where information was missing or incomplete—for instance, species with specimen records but no genetic data, or modern DNA sequences that couldn't be linked to physical specimens.
By combining historical specimen data with contemporary observations, researchers tested the system's ability to visualize changes in species distributions over time.
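The gap-identification step above can be approximated at the species level with plain set operations. This is a minimal sketch with an invented species list, not the portal's actual gap analysis.

```python
# Illustrative inputs: species names found in each kind of source.
specimen_species = {"Gentiana acaulis", "Parnassius apollo", "Rana temporaria"}
sequence_species = {"Parnassius apollo", "Rana temporaria", "Salamandra atra"}


def coverage_gaps(specimen_species, sequence_species):
    """Classify species by which kinds of data exist for them."""
    return {
        # Specimen records exist but no genetic data.
        "specimen_only": sorted(specimen_species - sequence_species),
        # Genetic data exists but no specimen record.
        "sequence_only": sorted(sequence_species - specimen_species),
        # Both kinds of data are available.
        "both": sorted(specimen_species & sequence_species),
    }


gaps = coverage_gaps(specimen_species, sequence_species)
```

In practice the names would first need to be reconciled to a common taxonomy, otherwise synonyms would show up as false gaps.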
The results were striking. For the first time, researchers could seamlessly trace species information across centuries, from a 19th-century museum specimen collected in the Alps to a modern DNA sequence obtained from the same region. The integrated system revealed previously invisible patterns, such as:
Shifts in species distributions in response to climate change that were only detectable when combining long-term specimen records with modern observations
Cases where DNA evidence suggested that what was historically considered one species might actually be several
Gaps where certain regions or species groups were dramatically under-represented, guiding future research efforts
Insights that wouldn't have been possible by examining any single data source in isolation
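To make the first pattern concrete, one simple way to look for a distribution shift is to compare a summary statistic between historical and modern records. The sketch below uses made-up elevation values and an arbitrary split year of 1950; neither the data nor the method is taken from the EU BON experiments.

```python
from statistics import mean

# Made-up records for one alpine species (not real observations).
records = [
    {"year": 1895, "elevation_m": 1800},
    {"year": 1910, "elevation_m": 1900},
    {"year": 2005, "elevation_m": 2100},
    {"year": 2015, "elevation_m": 2200},
]


def mean_elevation_by_period(records, split_year):
    """Return (historical_mean, modern_mean) around split_year.

    Raises statistics.StatisticsError if either period has no records.
    """
    historical = [r["elevation_m"] for r in records if r["year"] < split_year]
    modern = [r["elevation_m"] for r in records if r["year"] >= split_year]
    return mean(historical), mean(modern)


historical_mean, modern_mean = mean_elevation_by_period(records, 1950)
shift_m = modern_mean - historical_mean  # positive means an upslope shift
```

A comparison like this only works once historical specimen records and modern observations share the same fields, which is exactly what the integration step provides. A real analysis would also need to account for uneven sampling effort between the two periods.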
| Data Category | Specific Sources | Integration Challenges | EU BON's Solution |
|---|---|---|---|
| Specimen Data | Natural history collections, museum records, herbarium sheets | Varied formats, historical terminology, physical-only access | Standardized data capture, digitization protocols, metadata enhancement |
| DNA-Based Data | Genetic sequences, DNA barcodes, genomic analyses | Specialized formats, technical metadata requirements, privacy concerns | Development of specialized connectors, standardized metadata schemes |
| Observational Data | Field observations, citizen science reports, monitoring programs | Varying quality standards, different taxonomic resolutions | Quality validation tools, taxonomic name resolution services |
Creating a unified system for biodiversity data requires both conceptual innovation and practical tools. EU BON's approach brought together a suite of technologies and methods that enabled the seamless flow of information from scattered sources to integrated knowledge.
The project recognized that effective data integration requires both technical infrastructure and community engagement. Beyond the software and systems, EU BON established standards, protocols, and training resources that enabled diverse institutions to contribute to and benefit from the integrated network.
For genetic data specifically, the system incorporated elements similar to those used in specialized Genetic Information Management Systems 2 , which handle the unique challenges of DNA-based information—from interpretation support to standardized reporting and digital delivery formats.
| Tool Category | Specific Solutions | Function in Data Integration |
|---|---|---|
| Data Interoperability Tools | Schema mapping tools, semantic mediators, ontology services | Translate between different data formats and terminologies used by various collections |
| Genetic Data Processors | DNA sequence interpreters, quality validation tools, metadata enhancers | Process raw genetic data into standardized, discoverable formats with enhanced metadata |
| Data Enhancement Tools | Taxonomic name resolution services, geospatial validation tools, metadata generators | Improve data quality and completeness by adding contextual information and correcting errors |
| Platform Connectors | API interfaces, HL7/FHIR protocol support, REST API endpoints | Enable different computer systems to communicate and share data seamlessly 2 |
Data Interoperability Tools: bridging the gap between different data formats and systems through schema mapping and semantic mediation.
Genetic Data Processors: specialized tools for handling DNA sequences, quality validation, and metadata enhancement.
Data Enhancement Tools: improving data quality through taxonomic resolution, geospatial validation, and metadata generation.
Platform Connectors: APIs and protocols enabling seamless communication between different computer systems.
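Two of the data enhancement steps named above, taxonomic name resolution and geospatial validation, can be sketched as follows. The one-entry synonym table and the function names are hypothetical; a real service would consult a full taxonomic backbone and apply much richer spatial checks than a simple range test.

```python
# Tiny illustrative synonym table: old name -> currently accepted name.
SYNONYMS = {"Parus caeruleus": "Cyanistes caeruleus"}


def resolve_name(name, synonyms):
    """Collapse extra whitespace, then map a synonym to its accepted name."""
    cleaned = " ".join(name.split())
    return synonyms.get(cleaned, cleaned)


def valid_coordinates(lat, lon):
    """Check that both coordinates are present and within legal ranges."""
    if lat is None or lon is None:
        return False
    return -90 <= lat <= 90 and -180 <= lon <= 180


def enhance(record, synonyms):
    """Return a copy of the record with a resolved name and a validity flag."""
    out = dict(record)
    out["scientificName"] = resolve_name(record["scientificName"], synonyms)
    out["coordinatesValid"] = valid_coordinates(
        record.get("decimalLatitude"), record.get("decimalLongitude")
    )
    return out


good = enhance(
    {"scientificName": "Parus  caeruleus",
     "decimalLatitude": 48.2, "decimalLongitude": 16.4},
    SYNONYMS,
)
bad = enhance(
    {"scientificName": "Rana temporaria",
     "decimalLatitude": 120.0, "decimalLongitude": 16.4},
    SYNONYMS,
)
```

The sketch flags suspicious records instead of deleting them, so data providers can review and correct the originals.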
The true measure of EU BON's success lies not in its technical achievements alone, but in how these capabilities translate into real-world impact. By making integrated specimen and DNA data accessible through their European Biodiversity Portal 4, the project has created a powerful resource for addressing pressing environmental challenges.
Policymakers now have access to comprehensive data for tracking progress toward international targets like the CBD's Aichi Targets and the UN Sustainable Development Goals 4. Conservation planners can identify critical gaps in protected area networks by understanding both historical distributions and current genetic diversity. Researchers can trace the pathways of invasive species or monitor ecosystem responses to climate change with unprecedented resolution.
Perhaps most importantly, EU BON has helped democratize biodiversity information. Through citizen science gateways and accessible visualization tools 4, the system enables everyone from professional scientists to concerned citizens to participate in and benefit from integrated biodiversity knowledge.
| Application Area | How Integrated Data Helps | Specific Example |
|---|---|---|
| Conservation Planning | Identifies priority areas based on comprehensive species distribution and genetic diversity data | Targeting conservation resources toward regions with high unique genetic diversity revealed by combined specimen and DNA data |
| Climate Change Response | Tracks species distribution shifts by combining historical specimen records with modern observations | Documenting range movements of alpine species in response to warming temperatures across Europe |
| Invasive Species Management | Provides early warning by integrating observation data from multiple sources and countries | Detecting and tracking the spread of invasive aquatic species across European watersheds |
| Policy Development and Reporting | Supplies comprehensive data for international reporting obligations | Supporting national reporting for the Convention on Biological Diversity and IPBES assessments |
The work begun in EU BON represents a crucial step toward a future where biodiversity information flows as freely as weather data does today. As the project's approaches and standards continue to be adopted and refined, we move closer to a world where:
Every decision about ecosystem management can be informed by comprehensive, integrated data
Conservation resources can be allocated based on a complete understanding of biodiversity patterns and processes
Scientific discoveries can accelerate by building on previously isolated information sources
European biodiversity data becomes seamlessly integrated with global observation networks