Prediction of retention times of proteins in hydrophobic interaction chromatography using only their amino acid composition
Article
Publication date
2005-12-09
How to cite
Salgado, J. Cristián
Abstract
This paper focuses on predicting the dimensionless retention time (DRT) of proteins in hydrophobic interaction chromatography (HIC) using mathematical models based essentially only on amino acid composition. The results show that such prediction is indeed possible. Our main contribution is the design of models that predict the DRT from the minimal information available about a protein: its amino acid composition. Their performance is similar to that of models that use much more sophisticated information, such as the three-dimensional structure of proteins. Three models that, in addition to the amino acid composition, use different assumptions about the tendency of amino acids to be exposed to the solvent were evaluated on 12 proteins with known experimental DRT. In all cases analyzed, the best results were obtained by the model based on a linear estimation of the amino acid surface composition. The models were fitted using a collection of 74 vectors of amino acid properties plus a set of 6388 vectors derived from these with two mathematical tools: the k-means and self-organizing map (SOM) algorithms. The best vector was generated by the SOM algorithm and was interpreted as a hydrophobicity scale based partly on the tendency of amino acids to be buried in proteins. The prediction error (MSEJK) of this model was almost 35% smaller than that of the model that assumes all amino acids are completely exposed, and 40% smaller than that of the model that uses a simple correction factor for the general tendency of each amino acid to be exposed to the solvent. In fact, the best model based on amino acid composition performed 5% better than the model based on the three-dimensional structure of proteins.
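The kind of calculation described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation. The Kyte-Doolittle hydropathy values stand in for the property vector (the paper's best vector came from a SOM and is not given here). The optional `exposure` weights are a hypothetical stand-in for a per-residue exposure correction. The straight-line fit of DRT against average hydrophobicity is assumed, since the abstract does not state the functional form. MSEJK is read here as a leave-one-out (jackknife) mean squared error.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values: a stand-in scale, NOT the paper's SOM-derived vector.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(sequence):
    """Return the fraction of each standard amino acid in a one-letter sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def average_hydrophobicity(sequence, scale=KYTE_DOOLITTLE, exposure=None):
    """Composition-weighted mean of `scale`.

    `exposure` (hypothetical) maps each amino acid to a weight in [0, 1];
    None treats every residue as fully exposed.
    """
    comp = composition(sequence)
    weights = {
        aa: comp[aa] * (1.0 if exposure is None else exposure[aa])
        for aa in AMINO_ACIDS
    }
    norm = sum(weights.values())
    if norm == 0:
        raise ValueError("all residues have zero exposure weight")
    return sum(weights[aa] * scale[aa] for aa in AMINO_ACIDS) / norm


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def jackknife_mse(xs, ys):
    """Leave-one-out MSE: refit without each point, then predict that point."""
    xs, ys = list(xs), list(ys)
    squared_errors = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        squared_errors.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return sum(squared_errors) / len(squared_errors)
```

In use, `average_hydrophobicity` would be computed for each protein sequence, and `jackknife_mse` would be called with those values and the measured DRTs. Comparing that error across candidate scales or exposure weightings mirrors, in a much simplified form, the model comparison reported in the abstract.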
Quote Item
JOURNAL OF CHROMATOGRAPHY