Biocode Field Information Management System (Biocode FIMS): Connecting Field Data to the Laboratory and the World

John Deck, Information Services and Technology/Berkeley Natural History Museums
Neil Davies, Gump South Pacific Research Station
Capturing critical data elements at the source

The Biocode Field Information Management System (Biocode FIMS)
takes spreadsheet data generated in the field, validates it,
aligns it with global metadata standards, assigns unique identifiers,
and publishes a private or public version that other applications can
reference, such as collections management systems, GenBank, and data
harvesters. A particular focus is integration with Laboratory
Information Management Systems (LIMS). The design of Biocode FIMS
builds on two keys to good data linking: persistent identifiers and
alignment with standardized vocabularies and ontologies.
More Information

Information for interested users and developers is at:
http://code.google.com/p/biocode-fims
Field Data Collection
Collecting terrestrial invertebrates as part of the Moorea Biocode Project
[Diagram labels, Biological Collections Ontology figure. Key: subclass of; has specified output; has specified input; instance of; derives from. Ontology terms: BCO:material sampling process; BCO:identification process; BCO:material sample; OBI:sequencing assay; OBI:sequence data; BCO:taxonomic name; rdfs:Class. Other node labels: Insect specimen; Biocode Sampling; Tissue sampling; DNA extraction; Sequencing; Identification using key; Identification using BLAST; Tissue sample; DNA molecules; Genbank sequence B; TaxonID A; TaxonID B.]
Alignment with standardized vocabularies and ontologies

Biocode FIMS links spreadsheet fields to standardized vocabularies
such as Darwin Core (DwC), which describes events and specimens,
and Minimum Information about any (x) Sequence (MIxS), which
describes genomic data. We are also working with the OBO
Foundry and the Ontology for Biomedical Investigations (OBI) to
describe the logical relationships of sample-based biological data in a
new project called the Biological Collections Ontology (BCO)
(https://code.google.com/p/bco/).
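
As an illustration of what linking spreadsheet fields to a vocabulary can look like, the following is a minimal sketch, not the Biocode FIMS implementation. The Darwin Core term URIs are real terms; the spreadsheet column headers (Specimen_Num, Species, Lat, Long, Date_Collected) and the class name DarwinCoreMapper are hypothetical, chosen only for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Re-keys a spreadsheet row from local column headers to Darwin Core term URIs. */
final class DarwinCoreMapper {
    static final String DWC = "http://rs.tdwg.org/dwc/terms/";

    // Left side: hypothetical column headers invented for this sketch.
    // Right side: existing Darwin Core terms.
    private static final Map<String, String> COLUMN_TO_TERM = new LinkedHashMap<>();
    static {
        COLUMN_TO_TERM.put("Specimen_Num", DWC + "catalogNumber");
        COLUMN_TO_TERM.put("Species", DWC + "scientificName");
        COLUMN_TO_TERM.put("Lat", DWC + "decimalLatitude");
        COLUMN_TO_TERM.put("Long", DWC + "decimalLongitude");
        COLUMN_TO_TERM.put("Date_Collected", DWC + "eventDate");
    }

    /** Returns the row keyed by term URI; columns with no mapping are dropped. */
    static Map<String, String> align(Map<String, String> row) {
        Map<String, String> aligned = new LinkedHashMap<>();
        for (Map.Entry<String, String> cell : row.entrySet()) {
            String term = COLUMN_TO_TERM.get(cell.getKey());
            if (term != null) {
                aligned.put(term, cell.getValue());
            }
        }
        return aligned;
    }
}
```

Calling DarwinCoreMapper.align on a row that contains a "Species" column and an unmapped "Notes" column returns a single entry keyed by the scientificName term URI. Dropping unmapped columns is a choice made for the sketch; a real system might report them instead.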
A diagram showing how information is classified using the Biological Collections Ontology
Biocode FIMS Design

The following chart shows how information is generated and
organized logically in the Biocode FIMS database.

[Chart labels. Stages: Setup; Data submission; Query Return Data. Steps: Spreadsheet Templates; Identifier Keys by Project; Validation; Convert to RDF Triples; Map Spreadsheet to Standards; Upload; Query sets of spreadsheets (graphs); Inferencing.]
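
One step named in the chart, converting spreadsheet rows to RDF triples, can be sketched as follows. This is an illustrative assumption about the shape of that step, not the project's code: it serializes one row as N-Triples lines using plain string handling, and the class name RowToTriples, the subject URI, and the choice to skip empty cells are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Serializes one spreadsheet row as N-Triples, one triple per non-empty cell. */
final class RowToTriples {

    /** Escapes the characters that N-Triples forbids raw inside a literal. */
    static String escapeLiteral(String value) {
        return value.replace("\\", "\\\\")   // backslash first, so later escapes survive
                    .replace("\"", "\\\"")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r");
    }

    /**
     * @param subjectUri URI identifying the row (for example, a sample identifier)
     * @param row        cell values keyed by predicate URI
     */
    static List<String> toNTriples(String subjectUri, Map<String, String> row) {
        List<String> triples = new ArrayList<>();
        for (Map.Entry<String, String> cell : row.entrySet()) {
            String value = cell.getValue();
            if (value == null || value.isEmpty()) {
                continue; // assumption: empty cells produce no triple
            }
            triples.add("<" + subjectUri + "> <" + cell.getKey() + "> \""
                    + escapeLiteral(value) + "\" .");
        }
        return triples;
    }
}
```

Each returned string is one line of an N-Triples file with the row's identifier as subject, the column's term URI as predicate, and the cell value as a plain literal. A production system would more likely build triples through an RDF library than through string concatenation.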
Persistent Identifiers for Samples

Assigning persistent identifiers to samples as they are isolated
from nature or sub-sampled from other material is a critical
component of Biocode FIMS. Because these events usually
happen in the field, we need an identifier solution that works in
the field and also ensures that the identifier itself will resolve
for years to come. To meet this challenge, we worked
with the California Digital Library to develop Biocode Commons
Identifiers (http://biscicol.org/bcid/), an identifier solution
based on EZID (http://n2t.net/ezid). These identifiers look a lot
like digital object identifiers (DOIs) but are built on the archival
resource key (ARK) model, for example: http://n2t.net/ark:/21547/R2
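
To make the ARK structure concrete, here is a minimal sketch that splits an ARK into its Name Assigning Authority Number (NAAN) and name, and composes a resolvable URL. The parsing follows the general ARK form ark:/NAAN/Name. The withLocalId method reflects an assumption, not a documented Biocode Commons Identifiers rule: that a sample-level identifier could be formed in the field, without network access, by appending a locally assigned code to a pre-registered prefix. The class name and the example suffix are hypothetical.

```java
/** A parsed ARK of the form ark:/NAAN/Name, resolvable through n2t.net. */
final class SampleIdentifier {
    static final String RESOLVER = "http://n2t.net/";
    private static final String SCHEME = "ark:/";

    final String naan; // Name Assigning Authority Number, e.g. "21547"
    final String name; // everything after the NAAN, e.g. "R2"

    SampleIdentifier(String naan, String name) {
        if (naan.isEmpty() || name.isEmpty()) {
            throw new IllegalArgumentException("NAAN and name must be non-empty");
        }
        this.naan = naan;
        this.name = name;
    }

    /** Accepts either a bare ARK or a resolver URL that contains one. */
    static SampleIdentifier parse(String text) {
        int start = text.indexOf(SCHEME);
        if (start < 0) {
            throw new IllegalArgumentException("not an ARK: " + text);
        }
        String rest = text.substring(start + SCHEME.length());
        int slash = rest.indexOf('/');
        if (slash <= 0 || slash == rest.length() - 1) {
            throw new IllegalArgumentException("ARK needs NAAN and name: " + text);
        }
        return new SampleIdentifier(rest.substring(0, slash), rest.substring(slash + 1));
    }

    /** Assumed scheme: append a locally assigned code to a registered prefix. */
    SampleIdentifier withLocalId(String localId) {
        if (localId.isEmpty() || localId.chars().anyMatch(Character::isWhitespace)) {
            throw new IllegalArgumentException("local id must be non-empty, no whitespace");
        }
        return new SampleIdentifier(naan, name + localId);
    }

    String toArk() {
        return SCHEME + naan + "/" + name;
    }

    String toUrl() {
        return RESOLVER + toArk();
    }
}
```

For example, SampleIdentifier.parse("http://n2t.net/ark:/21547/R2") yields the NAAN 21547 and the name R2, and calling withLocalId("_sample001") on the result produces an identifier whose URL is the original URL with that suffix appended.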
Technical Details

Uses an XML configuration file to define validation rules for spreadsheets, how fields are logically related, and project codes to aid in assigning identifiers.
Stores spreadsheet data in a Fuseki TDB triplestore.
Integrates with Biocode Commons Identifiers through a REST service framework.
User interface available as a command-line tool and a Geneious plugin.
Coded in Java.
Code is open source and available under the Berkeley Software Distribution (BSD) license at http://code.google.com/p/biocode-fims
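
The idea of driving spreadsheet validation from an XML configuration file can be sketched as below. The XML layout (a rule element with type, column, min, and max attributes), the two rule types, and the class name ConfigValidator are all invented for this example; the actual Biocode FIMS configuration schema is not described here and may differ substantially.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/**
 * Validates rows against rules read from a hypothetical XML configuration, e.g.
 *   <fims><validation>
 *     <rule type="required" column="Specimen_Num"/>
 *     <rule type="numericRange" column="Lat" min="-90" max="90"/>
 *   </validation></fims>
 */
final class ConfigValidator {
    private final List<Element> rules = new ArrayList<>();

    ConfigValidator(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("rule");
            for (int i = 0; i < nodes.getLength(); i++) {
                rules.add((Element) nodes.item(i));
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("invalid configuration", e);
        }
    }

    /** Returns one message per violated rule; an empty list means the row passed. */
    List<String> validate(Map<String, String> row) {
        List<String> errors = new ArrayList<>();
        for (Element rule : rules) {
            String column = rule.getAttribute("column");
            String value = row.get(column);
            boolean blank = value == null || value.trim().isEmpty();
            switch (rule.getAttribute("type")) {
                case "required":
                    if (blank) errors.add(column + " is required");
                    break;
                case "numericRange":
                    if (blank) break; // emptiness is left to the "required" rule
                    double min = Double.parseDouble(rule.getAttribute("min"));
                    double max = Double.parseDouble(rule.getAttribute("max"));
                    try {
                        double n = Double.parseDouble(value.trim());
                        if (n < min || n > max) {
                            errors.add(column + " is outside " + min + " to " + max);
                        }
                    } catch (NumberFormatException e) {
                        errors.add(column + " is not a number: " + value);
                    }
                    break;
                default:
                    errors.add("unknown rule type: " + rule.getAttribute("type"));
            }
        }
        return errors;
    }
}
```

Keeping the rules in configuration rather than in code means a project could adjust required columns or accepted ranges without recompiling, which is one plausible motivation for the XML-driven design listed above.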
Laboratory Information Management System Integration

Biocode FIMS is partnering with Biomatters, makers of the
Geneious software, which is used to analyze field samples in the
laboratory using sequencing technologies. Biocode FIMS data and
tools are integrated with Geneious through a customized plugin.
National Science Foundation support from: Collaborative Research: BiSciCol Tracker: Towards a tagging and
tracking infrastructure for biodiversity science collections (DBI-0956426); Research Coordination Network for the
Genomic Standards Consortium (DBI-0840989); The National Evolutionary Synthesis Center (NESCent), NSF #EF-0905606
Developed in conjunction with faculty
and staff affiliated with the Berkeley
Natural History Museums, UC Berkeley
Development of the first version of
Biocode FIMS supported by the Gordon
and Betty Moore Foundation
John Deck is a programmer affiliated with Information
Services and Technology and the Berkeley Natural History
Museums. Contact: jdeck@berkeley.edu
Neil Davies is executive director of the UC Berkeley
Gump Station in Moorea, French Polynesia. Contact:
ndavies@moorea.berkeley.edu
