Statistical potentials from the Gaussian scaling behaviour of chain fragments buried within protein globules

Zamuner S.;Seno F.;Trovato A.
2022

Abstract

Knowledge-based approaches use the statistics collected from Protein Data Bank structures to estimate effective interaction potentials between amino acid pairs. They typically rely on empirical relations that hinge on the crucial choice of a reference state, which represents the case of null interaction. Despite their considerable effectiveness, the physical interpretation of knowledge-based potentials has been repeatedly questioned, and there is no consensus on the choice of the reference state. Here we use the fact that the Flory theorem, originally derived for chains in a dense polymer melt, also holds for chain fragments within the core of globular proteins, provided that one averages over buried fragments collected from different non-redundant native structures. After verifying that the ensuing Gaussian statistics, a hallmark of effectively non-interacting polymer chains, holds for a wide range of fragment lengths, although with significant deviations at short spatial scales, we use it to define a 'bona fide' reference state. Notably, although the reference state depends on fragment length, deviations from it do not. This allows us to estimate an effective interaction potential that is not biased by correlations due to the connectivity of the protein chain. We show how different sequence-independent effective statistical potentials can be derived with this approach by coarse-graining the protein representation at varying levels. The possibility of defining sequence-dependent potentials is also explored.
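The idea of measuring deviations from a Gaussian reference state can be illustrated with a minimal sketch. It is not the procedure used in the paper: it assumes the textbook ideal-chain distribution of the end-to-end distance with <R^2> = n b^2, and a plain Boltzmann inversion U(r) = -ln[P_obs(r) / P_ref(r)] in units of k_B T. The function names, the bin settings and the segment length b = 3.8 (in Angstrom) are hypothetical choices made only for illustration, and the "observed" distances are synthetic samples rather than data from protein structures.

```python
import math
import random

def gaussian_reference(r, n, b=3.8):
    """Ideal-chain radial density of the end-to-end distance r for n bonds.

    Assumes <R^2> = n * b^2 (Gaussian scaling); b is a hypothetical segment length.
    """
    mean_sq = n * b * b
    norm = (3.0 / (2.0 * math.pi * mean_sq)) ** 1.5
    return 4.0 * math.pi * r * r * norm * math.exp(-1.5 * r * r / mean_sq)

def histogram_density(distances, r_max, n_bins):
    """Return bin centers and a normalized density estimate on [0, r_max)."""
    width = r_max / n_bins
    counts = [0] * n_bins
    for d in distances:
        k = int(d / width)
        if 0 <= k < n_bins:
            counts[k] += 1
    total = len(distances)
    centers = [(k + 0.5) * width for k in range(n_bins)]
    density = [c / (total * width) for c in counts]
    return centers, density

def effective_potential(distances, n, r_max, n_bins, b=3.8):
    """Boltzmann inversion against the Gaussian reference, in units of k_B T.

    Bins with no observations get None instead of an infinite value.
    """
    centers, density = histogram_density(distances, r_max, n_bins)
    potential = []
    for r, p_obs in zip(centers, density):
        p_ref = gaussian_reference(r, n, b)
        if p_obs > 0.0 and p_ref > 0.0:
            potential.append((r, -math.log(p_obs / p_ref)))
        else:
            potential.append((r, None))
    return potential

def sample_ideal_chain_distance(n, b, rng):
    """Draw one end-to-end distance of a Gaussian chain (synthetic data)."""
    sigma = b * math.sqrt(n / 3.0)  # each Cartesian component has variance n b^2 / 3
    return math.sqrt(sum(rng.gauss(0.0, sigma) ** 2 for _ in range(3)))

# Synthetic check: distances drawn from the reference itself should give U close to 0.
rng = random.Random(0)
n, b = 20, 3.8
distances = [sample_ideal_chain_distance(n, b, rng) for _ in range(20000)]
potential = effective_potential(distances, n, r_max=60.0, n_bins=30, b=b)
```

With distances sampled from the reference distribution itself, the recovered potential stays close to zero wherever the bins are well populated. In this picture, a systematic departure from zero in real data would be read as an effective interaction rather than as an effect of chain connectivity.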
Files in this item:

PLoSONE22.pdf (open access)
Description: Main article
Type: Published (publisher's version)
License: Creative Commons
Size: 1.74 MB
Format: Adobe PDF

PLoSONE22_supp.pdf (open access)
Description: Supplementary material
Type: Other attached material
License: Creative Commons
Size: 3.22 MB
Format: Adobe PDF
Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3439476