An ensemble of Support Vector Machines for predicting the membrane proteins type directly from the amino acid sequence

Nanni, Loris
2008

Abstract

Knowing which membrane type a given membrane protein belongs to is important because this information provides clues to its function. In this work, we propose a system for predicting the membrane protein type directly from the amino acid sequence. The feature extraction step uses an encoding technique that combines physicochemical amino acid properties with the residue couple model, a method inspired by Chou’s quasi-sequence-order model that extracts features by exploiting the sequence-order effect indirectly. A set of support vector machines, each trained on a different physicochemical amino acid property combined with the residue couple model, is combined by a vote rule. The success rate obtained by our system on a difficult dataset, in which the sequences of a given membrane type have low sequence identity to any other protein of the same type, is quite high. This indicates that the proposed method, which extracts features directly from the amino acid sequence, is a feasible approach to predicting the membrane protein type.
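
The two building blocks named in the abstract, property-weighted residue couple features and a vote rule over several classifiers, can be sketched roughly as follows. This is an illustrative sketch only, not the implementation from the paper. The function names, the `max_gap` parameter, the choice of the Kyte-Doolittle hydropathy scale as the property, and the weighting of each couple by the sum of the two residues' property values are all assumptions, since the abstract does not specify them. Support vector machine training is omitted, and each classifier's output is represented by a plain label.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale, used here only as a stand-in property
# (assumption: the abstract does not say which properties were used).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def residue_couple_features(sequence, prop, max_gap=2):
    """Return 400 * max_gap features, one per ordered residue pair and gap.

    For each gap m, every pair (a, b) found at positions (n, n + m)
    contributes prop[a] + prop[b] (hypothetical weighting), and the
    totals are averaged over the len(sequence) - m available positions.
    """
    pairs = list(product(AMINO_ACIDS, repeat=2))
    n = len(sequence)
    features = []
    for gap in range(1, max_gap + 1):
        sums = {pair: 0.0 for pair in pairs}
        for pos in range(n - gap):
            couple = (sequence[pos], sequence[pos + gap])
            if couple in sums:  # skip non-standard residues
                sums[couple] += prop[couple[0]] + prop[couple[1]]
        denom = max(n - gap, 1)  # avoid dividing by zero on short sequences
        features.extend(sums[pair] / denom for pair in pairs)
    return features


def majority_vote(predictions):
    """Combine per-classifier labels by picking the most frequent one."""
    return Counter(predictions).most_common(1)[0][0]
```

In a full pipeline under these assumptions, one classifier would be trained per property on the vectors returned by `residue_couple_features`, and `majority_vote` would be applied to the labels those classifiers predict for a query sequence.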

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/157745