
Author (dc.contributor.author): Farkas, Carlos
Author (dc.contributor.author): Mella, Andy
Author (dc.contributor.author): Turgeon, Maxime
Author (dc.contributor.author): Haigh, Jody J.
Accession date (dc.date.accessioned): 2021-12-02T14:43:10Z
Available date (dc.date.available): 2021-12-02T14:43:10Z
Publication date (dc.date.issued): 2021
Item citation (dc.identifier.citation): Frontiers in Microbiology, June 2021, Volume 12, Article 665041
Identifier (dc.identifier.other): 10.3389/fmicb.2021.665041
Identifier (dc.identifier.uri): https://repositorio.uchile.cl/handle/2250/183018
Abstract (dc.description.abstract): An unprecedented amount of SARS-CoV-2 sequencing has been performed; however, novel bioinformatic tools to cope with and process these large datasets are needed. Here, we have devised a bioinformatic pipeline that takes SARS-CoV-2 genome sequencing data in FASTA/FASTQ format as input and outputs a single Variant Call Format (VCF) file that can be processed to obtain variant annotations and perform downstream population genetic testing. As proof of concept, we have analyzed over 229,000 SARS-CoV-2 viral sequences collected up until November 30, 2020. We have identified over 39,000 variants worldwide with increased polymorphisms, spanning the ORF3a gene as well as the 3′ untranslated regions (UTR), specifically in the conserved stem loop region of SARS-CoV-2, which is accumulating greater observed viral diversity relative to chance variation. Our analysis pipeline has also discovered the existence of SARS-CoV-2 hypermutation at low frequency (in less than 2% of genomes), likely arising through host immune responses and not due to sequencing errors. Among annotated nonsense variants with a population frequency over 1%, recurrent inactivation of the ORF8 gene was found. This inactivation is present in the newly identified B.1.1.7 SARS-CoV-2 lineage that originated in the United Kingdom. Almost all VOC-containing genomes possess one stop codon in the ORF8 gene (Q27*); however, 13% of these genomes also contain another stop codon (K68*), suggesting that ORF8 loss does not interfere with SARS-CoV-2 spread and may play a role in its increased virulence. We have developed this computational pipeline to assist researchers in the rapid analysis and characterization of SARS-CoV-2 variation.
Sponsor (dc.description.sponsorship): Supercomputing infrastructure of the NLHPC ECM02; Research Manitoba; CancerCare MB Research Foundation
Language (dc.language.iso): en
Publisher (dc.publisher): Frontiers Media
Type of license (dc.rights): Attribution-NonCommercial-NoDerivs 3.0 United States
Link to license (dc.rights.uri): http://creativecommons.org/licenses/by-nc-nd/3.0/us/
Source (dc.source): Frontiers in Microbiology
Keywords (dc.subject): 3′ UTR, SARS-CoV-2 variants
Keywords (dc.subject): Nucleotide diversity (π)
Keywords (dc.subject): Tajima’s D-statistic
Keywords (dc.subject): Viral evolution
Keywords (dc.subject): VCF
Title (dc.title): A novel SARS-CoV-2 viral sequence bioinformatic pipeline has found genetic evidence that the viral 3′ untranslated region (UTR) is evolving and generating increased viral diversity
Document type (dc.type): Journal article
Version (dc.description.version): Published version - publisher's final version
Access rights (dcterms.accessRights): Open access
Cataloguer (uchile.catalogador): cfr
Indexation (uchile.index): WoS publication article



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States