RAPHAEL: Recognition, periodicity and insertion assignment of solenoid protein structures.

Walsh, Ian Thomas; Minervini, G.; Ferrari, Carlo; Tosatto, Silvio
2012

Abstract

Motivation: Repeat proteins form a distinct class of structures where folding is greatly simplified. Several classes have been defined, with solenoid repeats of periodicity between ca. 5 and 40 being the most challenging to detect. Such proteins evolve quickly, and their periodicity may rapidly become hidden at the sequence level. From a structural point of view, finding solenoids may be complicated by the presence of insertions or multiple domains. To the best of our knowledge, no automated methods are available to characterize solenoid repeats from structure.

Results: Here we introduce RAPHAEL, a novel method for the detection of solenoids in protein structures. It reliably solves three problems of increasing difficulty: (i) recognition of solenoid domains, (ii) determination of their periodicity and (iii) assignment of insertions. RAPHAEL uses a geometric approach that mimics manual classification and produces several numeric parameters, which are optimized for maximum performance. The resulting method is very accurate, with 89.5% of solenoid proteins and 97.2% of non-solenoid proteins correctly classified. RAPHAEL periodicities have a Spearman correlation coefficient of 0.877 against the manually established ones. A baseline algorithm for insertion detection in identified solenoids has a Q2 value of 79.8%, suggesting room for further improvement. RAPHAEL finds 1,931 highly confident repeat structures not previously annotated as solenoids in the PDB records.
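The three kinds of figures quoted in the Results (per-class classification rates, a Spearman correlation for periodicities, and a Q2 value for insertions) can be illustrated with a minimal Python sketch. It is not RAPHAEL's code: the function names and the toy inputs are hypothetical, and Q2 is assumed here to mean the fraction of residues assigned to the correct one of two states (for example solenoid versus insertion).

```python
from typing import List, Sequence, Tuple


def per_class_accuracy(truth: Sequence[bool], pred: Sequence[bool]) -> Tuple[float, float]:
    """Return (fraction of true solenoids recovered, fraction of non-solenoids recovered)."""
    pos = [p for t, p in zip(truth, pred) if t]
    neg = [p for t, p in zip(truth, pred) if not t]
    return sum(pos) / len(pos), sum(not p for p in neg) / len(neg)


def _average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def q2(truth: str, pred: str) -> float:
    """Two-state per-residue accuracy (assumed definition of Q2)."""
    if len(truth) != len(pred):
        raise ValueError("label strings must have equal length")
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper.
    print(per_class_accuracy([True, True, False, False], [True, False, False, False]))
    print(spearman([12, 24, 33, 40], [11, 25, 30, 41]))
    print(q2("SSIISS", "SSISSS"))
```

Reporting the two classification rates separately, as the abstract does, keeps a large non-solenoid class from masking errors on the smaller solenoid class.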

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/2529433
Citations
  • PMC 15
  • Scopus 26
  • ISI 24