GalaxyTrakr: a distributed analysis tool for public health whole genome sequence data accessible to non-bioinformaticians
Author
dc.contributor.author
Gangiredla, Jayanthi
Author
dc.contributor.author
Rand, Hugh
Author
dc.contributor.author
Benisatto, Daniel
Author
dc.contributor.author
Payne, Justin
Author
dc.contributor.author
Strittmatter, Charles
Author
dc.contributor.author
Sanders, Jimmy
Author
dc.contributor.author
Wolfgang, William J.
Author
dc.contributor.author
Libuit, Kevin
Author
dc.contributor.author
Herrick, James B.
Author
dc.contributor.author
Prarat, Melanie
Author
dc.contributor.author
Toro Ibaceta, Magaly Alejandra
Author
dc.contributor.author
Farrell, Thomas
Author
dc.contributor.author
Strain, Errol
Accession date
dc.date.accessioned
2022-01-10T21:11:46Z
Available date
dc.date.available
2022-01-10T21:11:46Z
Publication date
dc.date.issued
2021
Item citation
dc.identifier.citation
BMC Genomics (2021) 22:114
es_ES
Identifier
dc.identifier.other
10.1186/s12864-021-07405-8
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/183636
Abstract
dc.description.abstract
Background: Processing and analyzing whole genome sequencing (WGS) is computationally intense: a single
Illumina MiSeq WGS run produces ~1 million 250-base-pair reads for each of 24 samples. This poses significant
obstacles for smaller laboratories, or laboratories not affiliated with larger projects, which may not have dedicated
bioinformatics staff or computing power to effectively use genomic data to protect public health. Building on the
success of the cloud-based Galaxy bioinformatics platform (http://galaxyproject.org), already known for its user-friendliness
and powerful WGS analytical tools, the Center for Food Safety and Applied Nutrition (CFSAN) at the U.S.
Food and Drug Administration (FDA) created a customized ‘instance’ of the Galaxy environment, called GalaxyTrakr
(https://www.galaxytrakr.org), for use by laboratory scientists performing food-safety regulatory research. The goal
was to enable laboratories outside of the FDA internal network to (1) perform quality assessments of sequence
data, (2) identify links between clinical isolates and positive food/environmental samples, including those at the
National Center for Biotechnology Information sequence read archive (https://www.ncbi.nlm.nih.gov/sra/), and (3)
explore new methodologies such as metagenomics. GalaxyTrakr hosts a variety of free and adaptable tools and
provides the data storage and computing power to run the tools. These tools support coordinated analytic
methods and consistent interpretation of results across laboratories. Users can create and share tools for their
specific needs and use sequence data generated locally and elsewhere.
Results: In its first full year (2018), GalaxyTrakr processed over 85,000 jobs and went from 25 to 250 users,
representing 53 different public and state health laboratories, academic institutions, international health laboratories,
and federal organizations. By mid-2020, it had grown to 600 registered users and processed over 450,000 analytical
jobs. To illustrate how laboratories are making use of this resource, we describe how six institutions use GalaxyTrakr
to quickly analyze and review their data. Instructions for participating in GalaxyTrakr are provided.
Conclusions: GalaxyTrakr advances food safety by providing reliable and harmonized WGS analyses for public
health laboratories and promoting collaboration across laboratories with differing resources. Anticipated
enhancements to this resource will include workflows for additional foodborne pathogens, viruses, and parasites, as
well as new tools and services.
es_ES
Sponsor
dc.description.sponsorship
Center for Food Safety and Applied Nutrition at the U.S. Food and Drug Administration
es_ES
Language
dc.language.iso
en
es_ES
Publisher
dc.publisher
BMC
es_ES
Type of license
dc.rights
Attribution-NonCommercial-NoDerivs 3.0 United States