Protein complex prediction via dense subgraphs and false positive analysis
Author
dc.contributor.author
Hernández, Cecilia
Author
dc.contributor.author
Mella, Carlos
Author
dc.contributor.author
Navarro, Gonzalo
Author
dc.contributor.author
Olivera Nappa, Álvaro
Author
dc.contributor.author
Araya, Jaime
Accession date
dc.date.accessioned
2018-06-29T14:34:02Z
Available date
dc.date.available
2018-06-29T14:34:02Z
Publication date
dc.date.issued
2017
Item citation
dc.identifier.citation
PLOS ONE, 12(9): e0183460
es_ES
Identifier
dc.identifier.other
https://doi.org/10.1371/journal.pone.0183460
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/149342
Abstract
dc.description.abstract
Many proteins work together with others in groups called complexes in order to achieve a
specific function. Discovering protein complexes is important for understanding biological
processes and predicting protein functions in living organisms. Large-scale, high-throughput
techniques have made it possible to compile protein-protein interaction networks (PPI networks),
which have been used in several computational approaches for detecting protein
complexes. Those predictions might guide future biological experimental research. Some
approaches are topology-based, where highly connected proteins are predicted to be
complexes; some propose clustering algorithms based on partitioning or on overlaps
among clusters, for networks modeled as unweighted or weighted graphs; and others
use density of clusters and information based on protein functionality. However, some
schemes still require long processing times, or the quality of their results can be improved.
Furthermore, most of the results obtained with computational tools are not accompanied
by an analysis of false positives. We propose an effective and efficient mining algorithm
for discovering highly connected subgraphs, which is our basis for defining protein complexes.
Our representation is based on transforming the PPI network into a directed acyclic
graph that reduces the number of represented edges and the search space for
discovering subgraphs. Our approach considers weighted and unweighted PPI networks.
We compare our best alternative using PPI networks from Saccharomyces cerevisiae
(yeast) and Homo sapiens (human) with state-of-the-art approaches in terms of clustering,
biological metrics and execution times, as well as three gold standards for yeast and
two for human. Furthermore, we analyze false-positive predicted complexes by searching the
PDBe (Protein Data Bank in Europe) database in order to identify matching protein complexes
that have been purified and structurally characterized. Our analysis shows that
more than 50 yeast protein complexes and more than 300 human protein complexes
found to be false positives according to our prediction method, i.e., not described in the
gold standard complex databases, in fact contain protein complexes that have been characterized
structurally and documented in PDBe. We also found that some of these protein
complexes have recently been classified as part of a Periodic Table of Protein Complexes.
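
The DAG representation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the DAG is built or how subgraphs are mined, so everything here is an assumption made for illustration: the vertex order (ascending degree, ties broken by name), the greedy clique growth, and the hypothetical function names `orient_as_dag` and `greedy_cliques`. It is not the authors' algorithm and it ignores edge weights.

```python
from collections import defaultdict

def orient_as_dag(edges):
    """Orient each undirected edge from its lower-ranked to its higher-ranked endpoint."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            adj[u].add(v)
            adj[v].add(u)
    # Assumed total order: ascending degree, ties broken by protein name.
    ordered = sorted(adj, key=lambda p: (len(adj[p]), p))
    rank = {p: i for i, p in enumerate(ordered)}
    # Each undirected edge is stored exactly once, so the result is acyclic.
    dag = {p: {q for q in adj[p] if rank[q] > rank[p]} for p in adj}
    return dag, rank

def greedy_cliques(dag, rank, min_size=3):
    """For each protein, greedily grow a clique inside its out-neighbourhood."""
    found = set()
    for u in dag:
        clique = [u]
        for v in sorted(dag[u], key=rank.get):
            # v joins only if it interacts with every current member.
            if all(v in dag[w] or w in dag[v] for w in clique):
                clique.append(v)
        if len(clique) >= min_size:
            found.add(frozenset(clique))
    return found

# Toy network: a triangle A-B-C plus a pendant protein D.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
dag, rank = orient_as_dag(edges)
complexes = greedy_cliques(dag, rank)
```

Because every edge points from a lower-ranked to a higher-ranked protein, each candidate subgraph is explored only from its lowest-ranked member, which is one way an orientation of this kind can shrink the search space.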
es_ES
Sponsor
dc.description.sponsorship
Basal funds FB0001 Conicyt,
Chile, and Fondecyt 1141311 Conicyt, Chile.