A multilevel factor approach for the analysis of CDS commonality and risk contribution

We introduce a novel multilevel factor model that allows for the presence of global and pervasive factors, local factors and semi-pervasive factors, and that captures common features across subsets of the variables of interest. We develop a model estimation procedure and provide a simulation experiment addressing the consistency of our proposal. We complete the analyses by showing how our multilevel model might explain the commonality across CDS premiums at the global level. In this respect, we cluster countries by either the Debt/GDP ratio or by sovereign ratings. We show that multilevel models are easier to interpret than factor models based on principal component analysis. Finally, we examine how the multilevel model might allow the recovery of the risk contribution due to the latent factors within a basket of country CDS.


Introduction
Common factors capture the most relevant time variation that is shared across the variables in a panel. Owing to their relevance and practicality in dealing with the dimensionality of large data sets, factor models have been extensively used in finance and macroeconomics. For estimation and inferential theory under different frameworks, see, among many others, Forni et al. (2000), Stock and Watson (2002), Bai and Ng (2002), Bai (2003), Forni et al. (2004), and Forni et al. (2005).
As a standard practice, most empirical studies rely on the existence of pervasive factors, that is, factors that spread over and explain the full cross-section of analysed variables. One of the most common approaches to recover factors is principal component analysis (PCA), mainly when N is large, in contrast with the state-space setup that is used when N is small; see Bai (2003). However, in recent years, multilevel factor structures have attracted attention in both theoretical and empirical research. The cornerstone of these models is to decompose the common factor structure into different levels, with factors associated with the full cross-section, i.e. pervasive, and factors that affect and explain only a specific subgroup of variables, the non-pervasive factors. Such a possibility might become extremely relevant, as Boivin and Ng (2006) point out that factors that have stronger loadings on some specific groups of series than others may lead to biased or even inconsistent principal components estimates for the unobservable factors.
The literature includes different approaches leading to multilevel factor models. One of the possible choices implies a number of zero restrictions in the associated loadings matrix, as discussed in Wang (2010), Choi et al. (2018), and Breitung and Eickmeier (2016) for the stationary case, and Rodríguez-Caballero and Ergemen (2017) for the non-stationary case. In such proposals, the factor structure is split into pervasive and non-pervasive common factors that are usually identified as global and regional (or local) factors, as applications of these models usually split data on a geographical basis, as in Kose et al. (2003). Global factors affect all variables in the panel, whereas regional ones affect only those in the respective region.
In this paper, we extend the literature of multilevel factor models by considering a new scenario in which the factor structure is more detailed with respect to the impact of factors on groups of variables. In this way, we are able to characterise both within-group and between-group variations. Thus, our factor structure allows for some interactions among groups or regions that do not contain information of the remaining groups. In turn, this allows disentangling the role of pervasive factors, the global ones, from the role of regional factors, active in a single local area or in a single group of variables, and, in the meantime, identifying factors associated with more than a single block. The latter factors are not pervasive in the cross-sectional dimensions, but allow for common behaviours across a subset of the blocks. We might refer to these interaction components as semi-pervasive factors.
The estimation method we propose for our novel multilevel factor model is similar in spirit to the sequential least squares procedure proposed by Breitung and Eickmeier (2016). We employ a successive procedure of canonical correlation analysis (CCA) and PCA to obtain initial values for all the factors involved in the model structure. Then, a sequential least squares procedure is used to estimate each one of the unobservable common factors. The different levels of factors are orthogonal to each other to ensure that the effects of a specific level of factors do not leak into other levels. We also assess the proposed methodology in relatively small samples by Monte Carlo simulations, through which we show that the method correctly identifies the factors.
We accompany the methodological advancement with an empirical analysis on the credit default swap (CDS) market. In particular, we analyse the multilevel factor structure that characterises a panel of sovereign CDS spreads for more than 50 countries with the purpose of identifying the sources of commonality and how these have an impact on the risk of a CDS portfolio. Our empirical study is thus related to the work of Longstaff et al. (2011), which also discusses the presence and sources of commonality across sovereign CDS spreads, but limits the analysis to the use of principal components. Our purpose is to show that the multilevel factor model might provide a different viewpoint by associating commonality to latent factors capturing different country features. In fact, the multilevel factor model builds on grouping criteria among the target variables, which are needed to disentangle the role of global factors from that of local factors, where the latter are group-specific, and further introduces semi-pervasive factors capturing across-groups patterns. In our case, we show that a multilevel model based on country groupings built upon the Debt/GDP ratio or the sovereign rating provides views that differ from those associated with principal components, and that are more easily economically interpreted. The analysis of commonality associated with economically based country groupings and the adoption of a multilevel model is new in the literature. A further work linked to our analysis is that of Fabozzi et al. (2016), who employ PCA and independent component analysis over a collection of European sovereign spreads. Their purpose is to evaluate the evolution of risk in the CDS market building on the role of the latent factors. In our work, to further highlight the different views provided by multilevel models and principal components, we take a step in a direction close to that of Fabozzi et al. (2016), still focusing on the risk dimension, but from a different angle.
In fact, we show the role that principal components and latent factors extracted from a multilevel model play in generating the risk of portfolios built with sovereign CDS. In this way, we highlight that multilevel models allow for a more detailed evaluation of the risk drivers, and show that grouping countries according to a given economic criterion has a crucial role. To the best of our knowledge, we are the first to take this analysis from a risk contribution perspective, using the tools introduced by Roncalli and Weisang (2016).
The analysis of the different roles of global and local factors in the CDS spreads market is not novel, and has attracted considerable interest over the last decade. Longstaff et al. (2011) find a relevant role for global factors, as opposed to country-specific, or local, factors by using principal components. Ang and Longstaff (2013) decompose the CDS spread of European countries and U.S. states into systematic (i.e. common) and idiosyncratic (i.e. local) components and use the decomposition to compare the two economic areas. Augustin and Tedongap (2016) build again on principal components to show the relevance of U.S.-based variables as global drivers of sovereign CDS spreads. Fender et al. (2012) adopt a different modelling framework for analysing the role of global and regional risk drivers. Overall, the use of principal components is widespread in studies dealing with the identification of CDS risk factors and in the analysis of commonality. A closely related study, though based on a different approach, is that of Kocsis and Monostori (2016), which uses a hierarchical model to disentangle global and local factors. However, none of the previous studies allow for the existence of semi-pervasive factors as in our multilevel model. In our analyses, we contrast the multilevel outcomes with PCA and show that the two approaches lead to the identification of latent factors with a different exposure to macroeconomic and financial drivers. The key difference is that the factors coming from PCA are more homogeneous in their exposure to economic drivers than the factors associated with country groupings and extracted from a multilevel model. This opens the door to an easier identification and interpretation, from an economic viewpoint, of latent factors.
The remainder of the paper is organised as follows. Section 2 introduces the model and the estimation methodology used, while Section 3 presents a finite sample study based on Monte Carlo simulations. Section 4 describes the data and the approaches for country grouping, while Section 5 discusses the features of the multilevel-based factor and contrasts them with PCA. Section 6 presents the risk contribution analysis. Section 7 performs subsample analyses. Section 8 presents the conclusion of this paper.

A multilevel model with subgroup factors
We consider a block-structured factor model in which the unobservable common factor structure may be classified in many different levels according to the number of blocks formed by data. Our model differs from standard multilevel factor models recently used in the literature because we allow for interactions among blocks in contrast with the standard two-level approach that assumes only global and regional factors.
For clarity of exposition, consider a panel formed by three different blocks of data, $B_1$, $B_2$, and $B_3$. A general factor structure may be formed by the global factor, $G_t$, which is a top-level pervasive factor that affects all blocks of data, and the pairwise factors, $F_{kj,t} \in \{F_{12,t}, F_{13,t}, F_{23,t}\}$, which are sublevel pervasive factors that affect only a pair of blocks; for instance, $F_{12,t}$ affects only the blocks $(B_1, B_2)$. Finally, the factor structure is also formed by the block-specific factors, $F_{k,t} \in \{F_{1,t}, F_{2,t}, F_{3,t}\}$, which are the non-pervasive factors that affect only a particular block. This factor structure is displayed in the Venn diagram in Figure 1.
Then, the three-block factor model is written as

$$y_{k,it} = \gamma_{k,i}' G_t + \sum_{j \neq k} \kappa_{kj,i}' F_{kj,t} + \lambda_{k,i}' F_{k,t} + u_{k,it}, \qquad (1)$$

where $k = 1, 2, 3$ indicates the block, the index $i = 1, \dots, N_k$ denotes the $i$th cross-section unit of block $k$, $t = 1, \dots, T$ is the time dimension, and $kj$ denotes the interaction between blocks $k$ and $j \in \{1, 2, 3\}$ with $k \neq j$, so that $F_{kj,t} = F_{jk,t}$. The total number of cross-sectional units is $N = N_1 + N_2 + N_3$. The unobservable common factor structure is decomposed as discussed before: the $r_G \times 1$ vector $G_t = (g_{1,t}, \dots, g_{r_G,t})'$ contains the $r_G$ unobservable global factors, the $r_{F_{kj}} \times 1$ vector $F_{kj,t} = (f_{kj,1,t}, \dots, f_{kj,r_{F_{kj}},t})'$ contains the pairwise block factors that interact only between blocks $k$ and $j$ with $k \neq j$, and the $r_{F_k} \times 1$ vector $F_{k,t}$ consists of the $r_{F_k}$ unobservable block-specific factors of block $k$. $\gamma_{k,i}$, $\kappa_{kj,i}$, and $\lambda_{k,i}$ are the $r_G$-, $r_{F_{kj}}$-, and $r_{F_k}$-dimensional factor loadings. The number of global, pairwise, and block-specific factors, as well as the cross-sectional dimension, can naturally vary in each block $k$. The idiosyncratic term, denoted by $u_{k,it}$, satisfies, for our purposes, the standard assumptions of an approximate factor model; see Bai (2003) for the standard case or Wang (2010) and Choi et al. (2018) for the multilevel case. However, the model may also allow for long-range dependence processes, in which case the common and idiosyncratic components should satisfy the assumptions provided by Rodríguez-Caballero and Ergemen (2017). We can rewrite (1) for the three blocks as the system

$$\begin{aligned}
y_{1,it} &= \gamma_{1,i}' G_t + \kappa_{12,i}' F_{12,t} + \kappa_{13,i}' F_{13,t} + \lambda_{1,i}' F_{1,t} + u_{1,it}, \\
y_{2,it} &= \gamma_{2,i}' G_t + \kappa_{21,i}' F_{12,t} + \kappa_{23,i}' F_{23,t} + \lambda_{2,i}' F_{2,t} + u_{2,it}, \\
y_{3,it} &= \gamma_{3,i}' G_t + \kappa_{31,i}' F_{13,t} + \kappa_{32,i}' F_{23,t} + \lambda_{3,i}' F_{3,t} + u_{3,it}.
\end{aligned} \qquad (2)$$

The system is written in matrix form as

$$Y = F^{*} \Lambda^{*\prime} + U,$$

where $Y$, $F^{*}$, and $\Lambda^{*}$ have dimensions $T \times N$, $T \times r_H$, and $N \times r_H$, respectively, $U$ collects the idiosyncratic terms, and $r_H = r_G + r_{F_{12}} + r_{F_{13}} + r_{F_{23}} + r_{F_1} + r_{F_2} + r_{F_3}$ defines the total number of unobservable common factors.
Following ideas in Wang (2010), Choi et al. (2018), and Breitung and Eickmeier (2016), the factor loadings are identified only up to a linear transformation, given in (3). To identify the factors, it is therefore necessary to adapt the usual normalisations used in PCA. In particular, the global, pairwise, and block-specific factors are orthonormal, and the matrix $A$ in (3) imposes that the blocks of global, pairwise, and block-specific factors are orthogonal to each other. The model specified by (1) can be extended to more than three blocks, although the complexity of the model, as well as the number of restrictions required for identification, as in (3) for the case of three blocks, will naturally increase with the number of blocks involved.

Estimation
The estimation procedure is based on the sequential approach proposed by Breitung and Eickmeier (2016), in which the main goal is to minimise the residual sum of squares (RSS) function by a sequence of two least-squares regressions until the RSS achieves a minimum. The algorithm can be executed for the general case of $K$ blocks as follows:
1. The algorithm starts by obtaining the initial values of the unobservable factors with the following strategy:
a) We employ canonical correlation analysis (CCA) on $y_{k,it}$ to obtain the initial estimator of the global factor, $\hat{G}_t^{(0)}$.
b) We regress $y_{k,it}$ on $\hat{G}_t^{(0)}$ in each block $k$ to filter out the global component. Then, we get the residuals, $y^{*(0)}_{k,it}$, from each of the $K$ separate regressions.
c) We again employ CCA on $y^{*(0)}_{k,it}$ to obtain the next lower level block factors.
d) Then, we regress $y^{*(0)}_{k,it}$ on the respective block factors involved and get the residuals.
e) Steps c) and d) are sequentially executed until the initial estimates of the pairwise block factors are obtained. We denote by $y^{**(0)}_{k,it}$ the residuals after filtering the pairwise factors in each block $k$.
f) Then, we run PCA on $y^{**(0)}_{k,it}$ in each block $k$ to obtain the initial estimates of the block-specific factors.
g) Once the initial estimators are obtained, the factor loadings at the initial step are estimated from time-series regressions of $y_{k,it}$ on the factors involved in each specific block $k$. Consequently, the factor loading matrix $\hat{\Lambda}^{*(0)}$ is constructed.
2. The updated estimators of the unobservable common factors are obtained by a sequential procedure based on the same reasoning, executed as follows:
a) Obtain the updated estimate of the global factor, $\hat{G}_t^{(1)}$, by least squares, given the loadings from the previous step.
b) Regress $y_{k,it}$ on $\hat{G}_t^{(1)}$ in each block $k$ to filter out the global component. Get $y^{*(1)}_{k,it}$.
c) Run a least-squares regression of $y^{*(1)}_{k,it}$ on the next lower level block factors.
d) Repeat the same procedure as before until the block-specific factors are obtained.
e) Next, the updated (and normalised) factors $\hat{F}^{*(1)}$ are used to re-estimate the loadings by least squares, which gives $\hat{\Lambda}^{*(1)}$.
3. Step 2 is repeated until the RSS converges to a minimum.
4. The last step consists in orthogonalising each level of factor estimates with respect to the remaining factors. Such orthogonalisation can be sequentially executed as before. Since all factors are orthogonalised with each other, we can now perform a correct variance decomposition of individual variables with respect to each factor.
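The alternation between the two least-squares regressions can be illustrated with a minimal Python sketch. It is not the full procedure described above: as simplifying assumptions, it starts from random factors instead of the CCA and PCA initial values, it updates all factors jointly rather than level by level, and it omits the normalisation and the final orthogonalisation. The function names `three_block_mask` and `sequential_least_squares`, and all numerical settings, are hypothetical.

```python
import numpy as np

def three_block_mask(n1, n2, n3):
    """Zero restrictions for three blocks; columns: G, F12, F13, F23, F1, F2, F3."""
    pattern = np.array([[1, 1, 1, 0, 1, 0, 0],
                        [1, 1, 0, 1, 0, 1, 0],
                        [1, 0, 1, 1, 0, 0, 1]], dtype=bool)
    return np.repeat(pattern, [n1, n2, n3], axis=0)

def sequential_least_squares(Y, mask, n_iter=200, tol=1e-10, seed=0):
    """Alternate two least-squares steps under zero restrictions on the loadings.

    Y    : (T, N) data panel.
    mask : (N, r) boolean array, True where series i may load on factor j.
    Returns the factors (T, r), the loadings (N, r) and the RSS path.
    """
    T, N = Y.shape
    r = mask.shape[1]
    rng = np.random.default_rng(seed)
    F = rng.standard_normal((T, r))  # ASSUMPTION: random start, not CCA/PCA
    L = np.zeros((N, r))
    rss_path = []
    for _ in range(n_iter):
        # Regression 1: given the factors, regress each series on the
        # factors it is allowed to load on.
        for i in range(N):
            cols = np.flatnonzero(mask[i])
            coef, *_ = np.linalg.lstsq(F[:, cols], Y[:, i], rcond=None)
            L[i] = 0.0
            L[i, cols] = coef
        # Regression 2: given the loadings, regress each cross-section y_t
        # on the loading matrix to update the factors.
        F = np.linalg.lstsq(L, Y.T, rcond=None)[0].T
        rss_path.append(float(np.sum((Y - F @ L.T) ** 2)))
        if len(rss_path) > 1 and rss_path[-2] - rss_path[-1] < tol * rss_path[-2]:
            break
    return F, L, rss_path

# Small simulated panel (all values are illustrative).
rng = np.random.default_rng(1)
T, n = 120, 12
mask = three_block_mask(n, n, n)
F_true = rng.standard_normal((T, 7))
L_true = rng.normal(1.0, 1.0, size=mask.shape) * mask
Y = F_true @ L_true.T + 0.5 * rng.standard_normal((T, 3 * n))
F_hat, L_hat, rss_path = sequential_least_squares(Y, mask)
```

Each regression minimises the RSS with respect to one set of unknowns while holding the other fixed, so `rss_path` cannot increase from one iteration to the next, which is the property the stopping rule relies on.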

Monte Carlo Simulation
In this section, using a Monte Carlo study, we examine the finite-sample properties of the estimation procedure proposed in Section 2.1. We focus on the case of three blocks for simplicity.
In our Monte Carlo study, the model in (1) is generated with three independent blocks, $K = 3$, with $N_k \in \{20, 50, 200\}$ cross-sectional units in each block, and $T \in \{150, 1500, 5000\}$ sample sizes. We consider for simplicity only one global factor, $G_t$, one factor in each pairwise block, $F_{12,t}$, $F_{13,t}$, $F_{23,t}$, and one block-specific factor in each block, $F_{1,t}$, $F_{2,t}$, $F_{3,t}$. All the unobservable factors are generated by a stationary AR(1) process where the autoregressive parameters are taken as 0.5, with variances $\sigma^2 \in \{1, 2\}$ to study the relative impact of a specific level of factors on the remaining ones. Furthermore, we consider idiosyncratic terms $u_{k,it} \overset{iid}{\sim} N(0, 2\varphi)$, with $\varphi \in \{5, 2, 0.5\}$ controlling the signal-to-noise ratio and corresponding to low, medium, and high signal-to-noise ratios. All factor loadings are generated as $N(1, 1)$ following Boivin and Ng (2006). In each experiment, we regress the actual factors on the estimated ones to evaluate the reliability of the procedure and compute the coefficients of determination for the global, pairwise, and block-specific factors, denoted as $R^2_G$, $R^2_{F_{12}}$, $R^2_{F_{13}}$, $R^2_{F_{23}}$, $R^2_{F_1}$, $R^2_{F_2}$, and $R^2_{F_3}$, respectively. These coefficients can be considered as a measure of consistency for all $t$; see Bai (2003). We compare the results obtained after applying the methodology proposed in this paper with the global and regional factors estimated by applying the methodology proposed by Breitung and Eickmeier (2016) in their two-level factor model. All simulations are based on 1000 replications of the model. Table 1 presents the results.
As can be seen from Table 1, the methodology proposed in this paper performs well in relatively small samples ($N_k = 20$, $T = 150$) and performs very well when sample sizes increase, independently of size distortions between $N_k$ and $T$. A low signal-to-noise ratio (rows with $\varphi = 5$) makes the factor estimates somewhat less precise, irrespective of the level, although such loss of precision is not dramatic. Furthermore, changes in the variances of the factors do not seem to have a considerable impact on the estimation of the factors. Finally, we observe that when a factor structure also includes pairwise factors, neglecting them, as in the two-level model proposed by Breitung and Eickmeier (2016), leaves the estimates of the global and regional factors substantially biased; these estimates perform very poorly even when the cross-section and time series dimensions increase considerably. In this sense, we can conclude that one should be cautious when analysing a panel data set consisting of different blocks of data.
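For concreteness, one draw of the data-generating process described in this section, together with the $R^2$ consistency measure, can be sketched in Python as follows. This is a minimal sketch and not the code behind Table 1: the function names are hypothetical, the burn-in length is an assumption, and the "estimates" are simply the first seven principal components of the panel, used as a stand-in for the estimator of Section 2.1, so the resulting number only measures how well the global factor is spanned by the estimated factor space.

```python
import numpy as np

# Rows: blocks 1 to 3. Columns: G, F12, F13, F23, F1, F2, F3.
PATTERN = np.array([[1, 1, 1, 0, 1, 0, 0],
                    [1, 1, 0, 1, 0, 1, 0],
                    [1, 0, 1, 1, 0, 0, 1]], dtype=bool)

def ar1(T, rho, sigma2, rng, burn=100):
    """Stationary AR(1) path; the burn-in length is an assumption."""
    shocks = rng.normal(0.0, np.sqrt(sigma2), size=T + burn)
    x = np.zeros(T + burn)
    for t in range(1, T + burn):
        x[t] = rho * x[t - 1] + shocks[t]
    return x[burn:]

def simulate_panel(n_k, T, phi, sigma2=1.0, seed=0):
    """One draw of the three-block model with one factor per level."""
    rng = np.random.default_rng(seed)
    factors = np.column_stack([ar1(T, 0.5, sigma2, rng) for _ in range(7)])
    mask = np.repeat(PATTERN, n_k, axis=0)            # zero restrictions
    loadings = rng.normal(1.0, 1.0, size=mask.shape) * mask
    noise = rng.normal(0.0, np.sqrt(2.0 * phi), size=(T, 3 * n_k))
    return factors @ loadings.T + noise, factors

def r_squared(actual, estimates):
    """R^2 from regressing an actual factor on estimated factors."""
    X = np.column_stack([np.ones(len(actual)), estimates])
    beta, *_ = np.linalg.lstsq(X, actual, rcond=None)
    resid = actual - X @ beta
    return 1.0 - resid.var() / actual.var()

Y, F = simulate_panel(n_k=20, T=150, phi=0.5)
Yc = Y - Y.mean(axis=0)
U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
pcs = U[:, :7] * np.sqrt(len(Y))       # stand-in estimates (PCA, not Section 2.1)
r2_global = r_squared(F[:, 0], pcs)
```

Repeating the draw over many seeds and averaging the $R^2$ values would give the kind of summary reported in a Monte Carlo table.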

Economic data and country grouping
Table 1: Measures of consistency of the estimated unobservable factors for $N_k \in \{20, 50, 200\}$ and $K = 3$, for the methodology proposed in this paper and for that of Breitung and Eickmeier (2016).

Notes: The DGP is the model in (1) with $N_k \in \{20, 50, 200\}$ and $T \in \{150, 1500, 5000\}$. $u_{k,it} \overset{iid}{\sim} N(0, 2\varphi)$ are independently generated, with $\varphi \in \{5, 2, 0.5\}$ controlling the signal-to-noise ratio. Only one top-level factor, one pairwise factor in each pairwise block, and one block-specific factor in each block are considered. $G_t = 0.5 G_{t-1} + w_t$ with $w_t \sim IIDN(0, \sigma_w)$ and $\sigma_w \in \{1, 2\}$, $F_{kj,t} = 0.5 F_{kj,t-1} + \nu_t$ with $\nu_t \sim IIDN(0, \sigma_\nu)$ and $\sigma_\nu \in \{1, 2\}$, and $F_{k,t} = 0.5 F_{k,t-1} + v_t$ with $v_t \sim IIDN(0, \sigma_v)$ and $\sigma_v \in \{1, 2\}$. $R^2_G$ is the $R^2$ of a regression of the actual on the estimated global factor. The corresponding coefficients of determination are computed analogously for the other factors. All experiments are based on 1000 replications.

The data set used in this paper collects the 5-year credit default swap (CDS) premia on government bonds. We download the data from the Thomson Reuters Datastream database. Our dataset is a balanced panel consisting of 53 countries for each day for the period 1 January 2009 to 11 December 2017, yielding a total of 2,333 daily observations for each country. In our analyses, we work with the standardised log-changes of the CDS premia. Table 2 shows the countries included in the analysis as well as the descriptive statistics of the associated CDS logarithmic returns. We stress that our dataset contains both developed and emerging countries. We define the panel composition in order to balance the cross-sectional dimension and the temporal coverage of the CDS data.
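The standardised log-changes of the CDS premia can be computed along the following lines. This is a minimal Python sketch under two assumptions: "standardised" is taken to mean zero mean and unit variance for each country, and the population standard deviation is used. The premia values and the function name are hypothetical.

```python
import math
from statistics import mean, pstdev

def standardised_log_changes(premia):
    """Log-changes of a series of positive CDS premia, scaled to zero mean
    and unit variance (ASSUMPTION: population standard deviation)."""
    changes = [math.log(b / a) for a, b in zip(premia, premia[1:])]
    m, s = mean(changes), pstdev(changes)
    return [(c - m) / s for c in changes]

# Hypothetical daily 5-year CDS premia (basis points) for one country.
premia = [100.0, 102.0, 101.0, 105.0, 103.0, 108.0]
z = standardised_log_changes(premia)
```

Applying the function country by country puts all series on the same scale, so that countries with very volatile premia do not dominate the extracted factors.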
The adoption of a multilevel factor model requires the existence of groups, or clusters, among the variables of interest. We cluster the 53 countries in two different ways: i) using the median of the Debt/GDP ratio over at most 8 years (2009-2017), and ii) using the last credit rating assigned by Standard & Poor's. Given these choices for clustering countries, we will evaluate the role of both the credit rating and the Debt/GDP ratios in driving the commonality across the countries included in our panel. In fact, the multilevel model will provide, apart from a global factor, a set of latent factors associated with single country groups and across sets of country groups.
When clustering the countries with respect to the median of the Debt/GDP ratio, we identify three different groups: the first comprises the 16 countries with the highest ratios (i.e. over 70%); the second group includes the 19 countries with a Debt/GDP ratio between 45% and 70%; the last group contains the remaining 18 countries with the lowest ratios (i.e. below 45%). Table 3 illustrates the countries clustered by the level of the Debt/GDP ratio, and the median ratios used to identify the groups.
We then cluster the countries with respect to the credit rating assigned by Standard & Poor's in 2017. The first group is formed by the 16 countries with the highest rating (above A+), the second group includes the 10 countries with a high-medium rating (from A- to A+), the third is formed by the 14 countries with a medium-low rating (from BBB+ to BBB-), and the remaining 13 countries are allocated to the lowest rating group. Table 4 depicts the countries clustered by the S&P credit rating.
The two grouping criteria are based on different, though linked, economic quantities. To verify whether the two groupings are related, we run a simple association analysis between the two country classifications, where classes correspond to groups; Table 5 contains the result. We observe that the two clustering criteria do not provide associated classifications. We read this evidence as reflecting the different informative content of the two clustering criteria. On the one hand, this suggests that by fitting a multilevel factor model on the two different country classifications, we could observe dissimilar results and recover interesting, though not aligned, economic interpretations. On the other hand, this might call for a more general clustering criterion, based on statistical clustering approaches. We also considered this additional possibility in a preliminary set of analyses. However, the findings were clearly inferior to those obtained from multilevel models based on our economically based clustering. We thus decided not to report this additional evidence in the paper, but to make it available upon request. The differences between the two clustering criteria could also suggest crossing them to recover a finer classification (thus including 12 groups). Despite this being of some relevance, the multilevel model that we would define will be characterised

Table 5: Frequency of countries over the two classification criteria. The last row reports the Chi-square test statistic for the null of no association between the two classification criteria for country grouping. The test statistic is distributed as a Chi-square with 6 degrees of freedom.
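The association analysis between the two classifications amounts to a Pearson Chi-square test on a 3 × 4 contingency table, which can be sketched in Python as follows. The cell counts below are hypothetical: only the margins (16, 19, and 18 countries by Debt/GDP group; 16, 10, 14, and 13 countries by rating group) match the group sizes reported in the text, and the function name is an assumption.

```python
def chi_square_independence(table):
    """Pearson Chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

# HYPOTHETICAL counts. Rows: High, Medium, Low Debt/GDP.
# Columns: High, High-Medium, Medium-Low, Low rating.
table = [
    [6, 3, 4, 3],
    [5, 4, 5, 5],
    [5, 3, 5, 5],
]
stat, dof = chi_square_independence(table)
```

With three Debt/GDP groups and four rating groups the degrees of freedom are (3 - 1)(4 - 1) = 6, in line with the table caption; the statistic would be compared with the 5% critical value of a Chi-square with 6 degrees of freedom, about 12.59.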

Global and economic-driven factors in CDS
We proceed with the estimation of the multilevel factor model on the CDS data. We consider two model specifications based on the two criteria used to classify the CDS data and described in the previous section. In detail, we fit a model for the Debt/GDP ratio classification and a model based on credit rating classification. Given that the two criteria lead to a different number of groups, the two fitted models will have a different number of factors.
In the following, in tables, figures as well as in the text, we will refer to groups by focusing either on the indicator level (Debt/GDP or rating) or on the factors. Table 6 shows the matching between the factors and the country groups. Overall, we have 15 factors in the rating case and 7 factors in the Debt/GDP case. The multilevel factor model for the Debt/GDP case exactly corresponds to the specification in equation 2, while the rating case has an equivalent, though more complex, structure. In the latter model, apart from the global factor and the group-specific, or local, factors, we have two collections of semi-pervasive factors: the first includes the four factors associated with the different sets of three country groups; and the second contains the six factors associated with pairs of groups.
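The factor counts follow from associating one factor with every non-empty subset of country groups, which gives 2^K - 1 factors for K groups when each level has a single factor. The short Python sketch below enumerates these subsets; the group labels and the function name are hypothetical, and the one-factor-per-subset setting is the assumption behind the counts.

```python
from itertools import combinations

def factor_structure(groups):
    """Map each level (number of groups affected) to the group subsets
    that carry a factor, from the global level down to single groups."""
    return {size: list(combinations(groups, size))
            for size in range(len(groups), 0, -1)}

debt_levels = factor_structure(["High", "Medium", "Low"])
rating_levels = factor_structure(["High", "High-Medium", "Medium-Low", "Low"])

n_debt = sum(len(subsets) for subsets in debt_levels.values())
n_rating = sum(len(subsets) for subsets in rating_levels.values())
```

For three groups this gives one global, three pairwise, and three group-specific factors; for four groups it gives one global factor, four factors on sets of three groups, six pairwise factors, and four group-specific factors.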
In addition, for comparison purposes, we estimate the factors affecting the CDS evolution by PCA. To estimate the optimal number of unobservable factors, that is, the number of principal components to consider, we use the procedure proposed by Alessi et al. (2010), which improves the penalisation in the well-known criteria of Bai and Ng (2002). Such improvement is given by a tuning multiplicative constant that leads to heteroskedasticity-robust inference. In our CDS dataset, we thus identify four factors, that is, we focus on the first four principal components.

By row, from top to bottom, left to right: loadings to the first factor; loadings to the second factor; loadings to the third factor; loadings to the fourth factor.

Figure 6: Box plots of estimated coefficients (loadings) to the principal components for country groups based on the sovereign debt rating (High, High-Medium, Medium-Low, and Low levels). By row, from top to bottom, left to right: loadings to the first factor; loadings to the second factor; loadings to the third factor; loadings to the fourth factor.

Figures 2 to 4 report the box-plots for the estimated loadings across country groups. To avoid scale effects due to the variances of the factors, the figures report loadings scaled by the volatility of the factors. We observe that for the Debt/GDP case, the global factor appears to affect the Medium and Low Debt/GDP ratio groups more strongly than the High ratio group. Irrespective of the group, the global factor positively impacts on the CDS returns, allowing us to safely interpret it as a market factor. Factor F12, impacting only on the High and Medium Debt/GDP ratio groups, is more relevant for the High group than for the Medium group. Notably, factors F13 and F23 (impacting on the High-Low and Medium-Low groups, respectively) seem to be less relevant, as the median loadings are rather small. Factor F23 (Medium-Low) has, in general, negative loadings to country CDS, while factor F13 (High-Low) positively impacts on CDS in most cases. Finally, the group-specific factors (F1, F2 and F3) show average loadings close to zero for the Medium and Low Debt/GDP ratio groups (F2 and F3, respectively) and positive loadings to CDS for the High Debt/GDP ratio group. From an economic viewpoint, the countries with a High Debt/GDP ratio are those less exposed to the global factor, which is interesting as it possibly signals the existence of relevant differences between this group and the other countries. We also note that the High Debt/GDP ratio group is more exposed to the High-Medium factor (F12), which might thus be largely driven by the behaviour of the High group, and is positively exposed to the group-specific factor. This strengthens our view that the High Debt/GDP ratio country group is characterised by different behaviours in the CDS dynamics compared with the Medium and Low Debt/GDP groups.
Moving to the rating case, we observe that results are more heterogeneous. The global factor has positive loadings to CDS rates, with a larger impact for the two central groups (the High-Medium and Medium-Low rating groups). For combined factors (impacting on two or three rating groups), it is complex to identify patterns, even if, in some cases, we might observe a predominance of the combined factors for specific rating-based country groups. Finally, group-specific factors seem more relevant now, as they are characterised by loadings with larger sizes. This evidence might be due to the more complex structure of the multilevel model. Notably, both the low rating and high rating groups seem to be less exposed to the global factor, possibly as a consequence of the heterogeneity within the groups. This is also in line with the relevant role played by the group-specific factors.
Finally, we consider the principal component analysis. Similarly to the multilevel factor models, we analyse the box-plots of the loadings, grouping them either by principal components, Figure 4, or coherently with the fitted multilevel models, Figures 5 and 6. We observe that the first principal component has the largest and most stable loadings across all the CDS, without remarkable differences across groups. Further, if we analyse the loadings of the second to the fourth components across the country groups (Debt/GDP ratio and rating), we do not see patterns that allow us to match principal components with groups. Apart from the first principal component, all other principal components have more dispersed loadings. This signals that the principal components have a less clear connection with the country groups based either on the Debt/GDP ratio or the sovereign rating, while the multilevel model, given that it is grounded on an economically based classification criterion, provides interpretable factors by construction.
Apart from the evaluation of the loadings, a first comparison between the two multilevel factor models and the more traditional PCA might build on the ability of the models to capture the behaviour of the CDS returns. We monitor this aspect by focusing on the fraction of the variance explained by the estimated factors, Figure 7, and the correlation between model residuals, Figure 8. Similarly to the loadings case, we compare PCA results by grouping them according to the groups adopted in the multilevel models. The fractions of variance explained by the PCA and by the multilevel model in the Debt/GDP case are very close, while the model based on rating groups provides more interesting and heterogeneous results. This could be a consequence of the larger flexibility of the model, which is capable of capturing the behaviour of specific country groups. Notably, the extreme groups (High Debt/GDP ratio, Low and High rating) show a smaller fraction of explained variance. Again, we link this to the heterogeneity existing within the groups. Moving to the correlation analysis, the principal components seem to provide slightly better results compared with the multilevel models; in addition, the two multilevel factor models provide residual correlations that are very close.
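The fraction of variance explained by each factor is well defined only when the factors are mutually orthogonal, which is what the final orthogonalisation step of the estimation procedure delivers. The Python sketch below illustrates the idea under stated assumptions: the orthogonalisation is a plain sequential regression (Gram-Schmidt type) in the order of the columns, which may differ from the ordering used in the paper, and the data, coefficients and function names are hypothetical.

```python
import numpy as np

def orthogonalise(F):
    """Sequentially replace each factor by its residual from a regression
    on the factors that precede it (columns are centred first)."""
    F = F - F.mean(axis=0)
    Q = np.empty_like(F)
    for j in range(F.shape[1]):
        f = F[:, j]
        if j > 0:
            coef, *_ = np.linalg.lstsq(Q[:, :j], f, rcond=None)
            f = f - Q[:, :j] @ coef
        Q[:, j] = f
    return Q

def variance_shares(y, Q):
    """Share of the variance of y attributed to each orthogonal factor."""
    y = y - y.mean()
    norms = np.sum(Q ** 2, axis=0)
    beta = (Q.T @ y) / norms          # one regression per factor suffices
    return beta ** 2 * norms / np.sum(y ** 2)

# Hypothetical example: three correlated factors and one series.
rng = np.random.default_rng(0)
F = rng.standard_normal((500, 3))
F[:, 1] += 0.5 * F[:, 0]              # make the second factor correlated
y = F @ np.array([1.0, 0.5, -0.8]) + rng.standard_normal(500)

Q = orthogonalise(F)
shares = variance_shares(y, Q)
unexplained = 1.0 - shares.sum()
```

Because the columns of `Q` are orthogonal, the shares add up with the unexplained fraction to one, so no variance is counted twice across levels.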
Summarising our findings, by moving from principal components to multilevel models, we note relevant changes in the loadings to the latent factors despite the overall fit of the model being substantially equivalent. On the one hand, this challenges the potential benefits of the multilevel model, an aspect we will discuss in the following subsection. On the other hand, this highlights that a somewhat more detailed model built on economically based grouping criteria is expected to provide a better economic description of the relation among target variables (i.e. the CDS spreads) compared to a purely data-driven model.

Are principal components and latent factors different?
We now take a deeper look at the differences between the latent factors recovered by our approach and factors obtained by a more standard principal component analysis. We start with a simple comparison between the dominant factor within the principal component setting and the global factor from our two multilevel models. Figure 9 shows the scatterplots between them.
We clearly note a positive correlation in both cases, thus suggesting that the dominant factors in the various approaches are possibly capturing the same latent behaviour. Table 7 shows the correlation between the first four principal components and the latent factors based on the Debt/GDP ratio. Notably, the global factor, despite being highly correlated with the first principal component (the correlation equals 0.86), has a significant and negative correlation with the second and third principal components, equal to -0.31 and -0.36, respectively. Furthermore, there is no one-to-one matching between the latent factors and the first four principal components, as there are, overall, 14 correlations that are above 0.2 in absolute terms, that is, 50% of the full set of correlations. Therefore, the latent factors seem to provide a different view on the country CDS with respect to the descriptive elements we might extract from PCA. The evaluation of Table 8 confirms this finding. Again, the first principal component and the global factor are highly correlated (the correlation coefficient equals 0.87), and the global factor is also negatively correlated with the second and third principal components, as in the Debt/GDP ratio case. Furthermore, there are 17 correlation coefficients that are above 0.2 in absolute terms. In addition, the factor F1, specific to the group with High rating, is negatively correlated with all of the first four principal components. We can thus state that the two multilevel factor models provide latent factors differing from PCA, despite showing some common behaviour. To clarify this point, we report in Table 9 the correlations between the latent factors obtained from the two models. Notably, apart from the expected high correlation between the global factors, equal to 0.95, the subgroup factors do not show a clear match, confirming our intuition that the two approaches for country groupings provide different information.
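The comparison in Tables 7 and 8 amounts to a cross-correlation matrix between latent factors (rows) and principal components (columns), followed by a count of the entries above 0.2 in absolute value. The sketch below reproduces this bookkeeping on simulated data only; the function names are hypothetical, the principal components are obtained from a plain SVD of the demeaned panel, and the seven simulated factors simply mimic the dimension of the Debt/GDP case.

```python
import numpy as np

def first_principal_components(X, n_components):
    """Scores of the first principal components of the (T, N) panel X."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def cross_correlation(factors, pcs):
    """Pearson correlations: factors on rows, principal components on columns."""
    F = (factors - factors.mean(axis=0)) / factors.std(axis=0)
    P = (pcs - pcs.mean(axis=0)) / pcs.std(axis=0)
    return F.T @ P / len(P)

# Simulated data (assumption): 7 latent factors driving 20 series
rng = np.random.default_rng(0)
T, N, n_factors = 800, 20, 7
latent = rng.standard_normal((T, n_factors))
panel = latent @ rng.standard_normal((n_factors, N)) + rng.standard_normal((T, N))

pcs = first_principal_components(panel, 4)
corr = cross_correlation(latent, pcs)          # shape (7, 4)
n_large = int((np.abs(corr) > 0.2).sum())      # entries above 0.2 in absolute value
share_large = n_large / corr.size              # fraction of the full set
```

A one-to-one matching between factors and components would show up as a single large entry in each column of `corr`; a large value of `share_large` points instead to components that mix several latent factors.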
Both the principal component approach and our multilevel model allow for analysing the commonality among CDS spreads. Both methods highlight the role of common latent factors, in particular the global one. However, the use of a multilevel model attributes the commonality not just to global factors or to country group-specific factors, but also to semi-pervasive factors that capture an intermediate commonality aspect among the CDS spreads. From a different viewpoint, we might interpret these semi-pervasive factors as linked to the heterogeneity of the target variables. While the global factors capture the overall patterns and the group-specific factors are associated with the common components within groups, the semi-pervasive factors capture the part of the homogeneity associated with the groups. Furthermore, while the principal components can only be attributed ex post, in a more or less relevant way, to groups of countries, the factors extracted from a multilevel model come, by construction, from specific groups of countries. Consequently, the interpretation of the model outcomes is simpler and more immediate.
Table 7: Correlation between principal components (on columns) and factors of a multilevel model based on Debt/GDP ratio (1 is High, 2 is Medium and 3 is Low).
To further analyse the differences between the factors extracted by PCA and those estimated with a multilevel model, we regress the factors on a set of world-related macroeconomic variables, all expressed in daily returns over the same sample period of our analyses. We recover all data from Thomson Reuters Datastream. Our first and second explanatory variables are the Dow Jones World Developed and the Dow Jones World Emerging equity indexes. They allow tracking the equity market impact on the sovereign CDS risk, separating the role of developed markets from that of emerging markets. The third regressor is the VIX index, a fear index that we use to proxy the global uncertainty in the equity markets. It captures the potential impact of financial market uncertainty on the sovereign bond risk. Then, we include the oil price, a proxy for the commodity risk, which might have an indirect impact on both the oil importing and oil exporting countries. The impact will be mediated by the real economic effects of oil price changes. The fifth variable we include is the U.S. nominal dollar broad index, a proxy for the currency risk, a further indirect driver of possible changes in the sovereign risk.
Finally, the last four variables, all from FTSE, track the bond market: a world composite total return index for sovereign bonds (all maturities), to track the bond market impact on the CDS spreads; the differential between the returns on the world total return index for bonds with 10 years maturity and the returns on the world total return index for bonds with maturity from 1 to 3 years, to track the impact of maturity risk on the CDS spreads; the differential between the returns on the world total return index for sovereign bonds with A rating and the returns on the world total return index for bonds with AAA rating, to monitor the sovereign credit spread role; and the differential between the returns on the world total return index for big corporate bonds with BBB rating and the returns on the world total return index for big corporate bonds with AAA rating, to monitor the corporate credit spread role. Tables 10-12 report the estimated coefficients. The factors react in a significant way to several macroeconomic world drivers. However, finding specific patterns for the multilevel factors seems hard. The sovereign credit spread is more relevant, across factors, than the corporate credit spread. The term spread becomes more relevant in the rating classification case, a somewhat expected result, as the sensitivity to the maturity for a specific country might be linked to the rating of the country debt. Equity indexes are also relevant, with the Emerging index prevailing over the Developed index. Finally, the currency also seems to be of some relevance. Overall, the factors react to different subsets of macroeconomic drivers but without a clear economic intuition. We read this evidence as distinctive of the model that builds factors on the basis of specific country groupings, without a clear association with the global economic drivers we select. The case of principal components is similar, but now the factors react, for the most part, to the same set of drivers, in particular the second and third principal components, making it virtually impossible to read their exposures from an economic viewpoint. Summarising, we believe that the adoption of a multilevel model leads to factors that are at least partially associated with an economic intuition, the one behind the grouping criteria.
Table 9: Correlation between the factors of the two multilevel models. On columns the factors based on the Debt/GDP classification (High, Medium and Low) while on the rows the factors based on the country rating classification (High, Medium-High, Medium-Low and Low).
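The regressions summarised above are ordinary least squares fits of each factor on the nine macroeconomic drivers, with Newey-West standard errors. A minimal sketch of such a regression is given below. It is not the code behind Tables 10-12: the data are simulated, the function name `ols_newey_west` is hypothetical, the truncation lag of 5 is an arbitrary assumption, and the estimator uses the Bartlett kernel without small-sample correction.

```python
import numpy as np

def ols_newey_west(y, X, lags):
    """OLS with intercept and Newey-West (Bartlett kernel) standard errors."""
    T = len(y)
    X = np.column_stack([np.ones(T), X])       # prepend an intercept
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta                           # OLS residuals
    Xu = X * u[:, None]                        # rows are x_t * u_t
    S = Xu.T @ Xu                              # lag-0 term
    for lag in range(1, lags + 1):
        w = 1.0 - lag / (lags + 1.0)           # Bartlett weight
        G = Xu[lag:].T @ Xu[:-lag]             # lag-l autocovariance (unscaled)
        S += w * (G + G.T)
    cov = XtX_inv @ S @ XtX_inv                # sandwich covariance
    return beta, np.sqrt(np.diag(cov))

# Simulated example (assumption): one factor, nine drivers, AR(1) errors
rng = np.random.default_rng(1)
T, p = 2000, 9
X = rng.standard_normal((T, p))
beta_true = np.array([0.5, -0.3, 0.0, 0.2, 0.0, 0.0, -0.4, 0.0, 0.1])
e = rng.standard_normal(T)
u = np.empty(T)
u[0] = e[0]
for t in range(1, T):
    u[t] = 0.3 * u[t - 1] + e[t]
y = 0.05 + X @ beta_true + u

beta_hat, se = ols_newey_west(y, X, lags=5)
t_stats = beta_hat / se                        # first entry is the intercept
```

Running the same regression for every latent factor and every principal component, and comparing which drivers carry large t-statistics, gives the kind of pattern comparison discussed above.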

A risk contribution analysis
To further explain the relation between the factors and the risk of sovereign CDS co-movements, we proceed to an analysis focusing on the risk dimension. In a related work, Fabozzi et al. (2016) analyse the volatility of latent factors extracted from European CDS by means of PCA or by independent component analysis (ICA). Their purpose was to highlight the sources of risk and relate the risk changes to policies. In our case, we are interested in characterising the role played by the various latent factors, coming from the multilevel model, in explaining the risk of different country groups.
Therefore, we adopt a recent risk decomposition derived from the work of Roncalli and Weisang (2016). We start from a general linear factor model, where the covariance matrix of $R_t$ equals
$$\Sigma_R = B^\top \Sigma_F B + \Sigma_\varepsilon, \tag{5}$$
and where $\Sigma_R$ is the covariance of $R_t$, $B$ is the matrix of factor loadings, $\Sigma_F$ is the covariance matrix among the factors, and $\Sigma_\varepsilon$ is the residual covariance. For both multilevel factor models and for PCA-based factors, $\Sigma_F$ is a diagonal matrix.
Although the covariance decomposition in (5) allows analysing the role of each factor in explaining the variance of each element in $R_t$, we prefer to work at an aggregate level, that is, by focusing on portfolios. As our purpose is to identify the risk drivers of sovereign CDS when countries are clustered according to a specific criterion, we choose to analyse the risk decomposition for equally weighted portfolios formed by the CDS of a specific country group. In particular, we consider four equally weighted portfolios when clustering countries according to the rating and three portfolios when clustering countries on the basis of the Debt/GDP ratio.
Notes: … is the FTSE all maturities sovereign bond index, Bond_W_10+_minus_1_3 is the differential in the yield to maturities of 10+ years and 1-3 years FTSE sovereign bond indexes, Bond_W_A_minus_AAA and Bond_W_C_BBB_minus_AAA are the differentials in yield to maturities between FTSE corporate bond indexes of A rated and AAA rated bonds and BBB rated and AAA rated bonds. Robust Newey-West standard errors are reported in parentheses. * p<0.1; ** p<0.05; *** p<0.01.

To measure the role played by each latent risk factor, we follow the approach of Roncalli and Weisang (2016) and perform a risk contribution decomposition in the presence of risk factors.
Start from the linear factor model for $k$ assets and using $m < k$ factors
$$R_t = B^\top F_t + \varepsilon_t. \tag{6}$$
Then, consider a portfolio with weights vector equal to $\omega$. The weights represent the portfolio exposure to the assets, and using the linear model we can recover the portfolio exposure to the factors, $\delta$, as
$$\omega^\top R_t = \omega^\top B^\top F_t + \omega^\top \varepsilon_t = \delta^\top F_t + \omega^\top \varepsilon_t. \tag{7}$$
The exposure to the assets and the exposure to the factors are thus related by $\delta = B\omega$. Roncalli and Weisang (2016) suggest using a decomposition of $\omega$ introduced by Meucci (2007)
$$\omega = B^{+}\delta + \tilde{B}^{+}\tilde{\delta}, \tag{8}$$
where $B^{+}$ is the Moore-Penrose inverse of $B$ and $\tilde{B}^{+}$ is any matrix that spans the left null space of $B^{+}$. This decomposition links the portfolio weights to the portfolio factor loadings $\delta$ and to the loadings to a set of residual factors ($\tilde{\delta}$). Using this decomposition, we recover a decomposition of the total risk for a given portfolio. We use the portfolio volatility as a risk measure and therefore set
$$\mathcal{R}(\omega) = \sigma(\omega) = \sqrt{\omega^\top \Sigma_R\, \omega}. \tag{9}$$
In the absence of risk factors, the total portfolio risk can be decomposed into the contributions coming from the different assets, $RC_i$, with $i = 1, 2, \ldots, k$, with the following equalities (or Euler equation)
$$\mathcal{R}(\omega) = \sum_{i=1}^{k} RC_i = \sum_{i=1}^{k} \omega_i \frac{\partial \mathcal{R}(\omega)}{\partial \omega_i}. \tag{10}$$
Consequently, the risk contribution is equal to the product of the weight of an asset in the portfolio times the marginal risk of that asset. Moreover, the sum of risk contributions equals total risk.
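In the presence of risk factors, and using (8), we have $\mathcal{R}(\omega) = \mathcal{R}(\delta, \tilde{\delta})$ and the Euler equation is still valid and becomes
$$\mathcal{R}(\delta, \tilde{\delta}) = \sum_{j=1}^{m} \delta_j \frac{\partial \mathcal{R}(\delta, \tilde{\delta})}{\partial \delta_j} + \sum_{j=1}^{k-m} \tilde{\delta}_j \frac{\partial \mathcal{R}(\delta, \tilde{\delta})}{\partial \tilde{\delta}_j}. \tag{11}$$
The first term on the right-hand side of (11) is our quantity of interest, that is, the contribution to the total risk that we can attribute to the factors. The second term represents a residual risk component. For the first term, we can write
$$\frac{\partial \mathcal{R}(\delta, \tilde{\delta})}{\partial \delta_j} = \left( (B^{+})^\top \frac{\partial \mathcal{R}(\omega)}{\partial \omega} \right)_j, \tag{12}$$
where we have used (8). Given this result, when the risk measure is the portfolio volatility, the risk contribution due to factors equals
$$RC(F_j) = (B\omega)_j \left( (B^{+})^\top \frac{\Sigma_R\, \omega}{\sigma(\omega)} \right)_j. \tag{13}$$
The risk decomposition for portfolio $\omega$ in the presence of risk factors thus becomes
$$\mathcal{R}(\omega) = \sum_{j=1}^{m} RC(F_j) + \sum_{j=1}^{k-m} \tilde{\delta}_j \frac{\partial \mathcal{R}(\delta, \tilde{\delta})}{\partial \tilde{\delta}_j}, \tag{14}$$
where the last term is the risk that cannot be explained by factors. The previous quantities might also be expressed in relative terms by standardizing all of them by the portfolio total risk $\mathcal{R}(\omega)$. The decomposition of the portfolio risk into the risk contribution of factors and the residual risk allows identifying the role exerted by each factor, and measuring the overall relevance of the factors. If the residual risk accounts for a large fraction of the total portfolio risk, we might be facing a missing factor, or we might read this as evidence of an inappropriate factor decomposition. This is particularly relevant from a portfolio point of view, where the residual risk should play a minor role due to diversification benefits.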
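The decomposition above can be checked numerically. The following sketch is only an illustration under stated assumptions: the loadings and covariances are simulated, the function name `factor_risk_contributions` is hypothetical, and the loading matrix is stored with one row per factor, so that the factor exposures are obtained as the loading matrix times the weights, as in the relation between asset and factor exposures given in the text. The residual contribution is computed in aggregate from the part of the weights not spanned by the factor exposures.

```python
import numpy as np

def factor_risk_contributions(B, Sigma, w):
    """B: (m, k) loadings; Sigma: (k, k) asset covariance; w: (k,) weights."""
    sigma = np.sqrt(w @ Sigma @ w)             # portfolio volatility
    marginal = Sigma @ w / sigma               # marginal risk of each asset
    B_pinv = np.linalg.pinv(B)                 # Moore-Penrose inverse, (k, m)
    delta = B @ w                              # exposures to the factors
    rc_factors = delta * (B_pinv.T @ marginal) # one contribution per factor
    w_residual = w - B_pinv @ delta            # weights not spanned by factors
    rc_residual = w_residual @ marginal        # aggregate residual contribution
    return sigma, rc_factors, rc_residual

# Simulated inputs (assumption): 6 assets, 2 uncorrelated factors
rng = np.random.default_rng(0)
k, m = 6, 2
B = rng.standard_normal((m, k))
Sigma_F = np.diag([1.0, 0.5])                  # diagonal factor covariance
Sigma_eps = np.diag(rng.uniform(0.1, 0.3, k))  # residual covariance
Sigma = B.T @ Sigma_F @ B + Sigma_eps

w = np.full(k, 1.0 / k)                        # equally weighted portfolio
sigma, rc_factors, rc_residual = factor_risk_contributions(B, Sigma, w)
shares = np.append(rc_factors, rc_residual) / sigma   # relative contributions
```

By construction the factor and residual contributions add up to the portfolio volatility, so the entries of `shares` sum to one; individual contributions may be negative when an exposure and the corresponding marginal risk have opposite signs.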
In our setting, as the final purpose is to evaluate the role of the latent factors in driving the risk of the CDS over country groups, we report the risk decomposition for country portfolios. The portfolios we consider are formed by all the countries belonging to one of the three groups by Debt/GDP ratio or by all the countries included in one of the four groups by sovereign debt rating. In all cases, we build portfolios with equal weights in order to highlight the trend within groups and to avoid overweighting single countries (which is possible due to the heterogeneity present within groups in terms of specific indicators that could have been used to define different weighting schemes). Tables 13 and 14 collect the results over the two country classifications, while Table 15 shows the risk contribution due to principal components across the country portfolios based on either rating or Debt/GDP.
In the rating case, we note that the global factor is the most relevant risk contributor for all groups, and in particular for countries in the Medium-High and Medium-Low groups, where the factor accounts for 68% and 77% of the total risk, respectively. The risk contribution of the global factor is relatively lower in the High rating group, where it explains 41% of the total risk. Trivariate factors do have a relatively high contribution to the risk for specific country portfolios. The factor associated with the High, Medium-High and Medium-Low rating groups (F123) has a relevant role for the High rating group, explaining 23% of the total risk, and a significant role for the other two country groups (12% and 9%, respectively). The factors F134 and F234 are relevant for the High (F134), Medium-Low and Low groups (both F134 and F234), but with fractions of risk explained smaller than those observed for F123. The factor F124 is not significant at all. Bivariate factors are in general negligible in terms of risk contribution, apart from the two bivariate factors associated with the High and Medium-High (F12) and the Medium-Low and Low (F34) groups. The former has a minor relevance in explaining the risk of the two higher rating groups, while the latter is crucial for the risk of the Low rating group, where it reaches 15% of the risk explained. Group-specific factors assume a relevant role, in particular for the two higher rating groups, where they explain 29% and 15% of the total risk, respectively. Finally, the residual risk is very low, indicating that, from a risk perspective, the identified factors capture most of the risk in the country groups based on rating. The only exception is the lower rating group, where the residual risk is 5.8%.
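The factor labels used above follow a simple pattern: each factor is indexed by the set of groups it loads on, with groups numbered from 1 (High) to 4 (Low). The sketch below enumerates such labels under the assumption, not stated explicitly in this section, that the model admits one factor for every non-empty subset of groups; the function name `multilevel_factor_labels` is hypothetical, and the global factor appears here with the label of the full set.

```python
from itertools import combinations

def multilevel_factor_labels(n_groups):
    """Labels for one factor per non-empty subset of groups (assumption)."""
    labels = []
    # Largest subsets first: global, then semi-pervasive, then group-specific
    for size in range(n_groups, 0, -1):
        for subset in combinations(range(1, n_groups + 1), size):
            labels.append("F" + "".join(str(g) for g in subset))
    return labels

rating_labels = multilevel_factor_labels(4)   # four rating groups
debt_labels = multilevel_factor_labels(3)     # three Debt/GDP groups
```

Under this assumption the rating classification yields 15 factors and the Debt/GDP classification yields 7. The latter is consistent with the earlier remark that 14 correlations with the first four principal components represent 50% of the full set.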
This is completely different from what we observe in Table 15, where the residual risk, that is, the fraction of risk not explained by the principal components, is up to 10 times higher than the fraction of unexplained risk in the multilevel model case. From an economic point of view, the evidence matches that obtained in terms of loadings, as we have a confirmation of the central role of global factors, the relevant role of factors associated with extreme groups, and the importance of group-specific factors.
When looking at the risk decomposition for the Debt/GDP case, we note that all factors now have some impact on the various country portfolios. Moreover, the residual risk is higher than in the rating case, even if it remains at noticeably lower levels compared with the PCA risk contribution. Again, these results are in line with the findings associated with the loadings, but also highlight that the introduction of a finer classification of countries (four groups in the rating case versus three groups in terms of Debt/GDP) seems to be more flexible in capturing the heterogeneity that characterises the countries as well as the country groups.
A final comment concerns the PCA risk decomposition, where the first principal component has a central role and is, overall, more relevant than the global factors extracted from the multilevel model. The second to the fourth components explain a limited fraction of the total risk, and the residual risk is much higher than in the multilevel model cases. This further confirms that by resorting to PCA we tend to assign a predominant role to the first component, which has an unclear economic intuition, apart from being close to a weighted average of the variables (in our case, we do have positive loadings to the first component). By contrast, a multilevel model allows identifying a collection of potentially relevant latent factors, which can be matched, by construction, with groups of the variables. Consequently, if grouping is based on an economic criterion, we can associate factors with an economic intuition.
Overall, in our case, the risk contribution analysis highlights the role of latent factors for monitoring the risk of country groupings, in particular those based on ratings. The role of the global factor remains predominant, but we also highlight a relevant contribution from semi-pervasive factors associated with specific sets of country groups. Therefore, the adoption of economically based country grouping criteria in combination with a multilevel model could provide relevant insights in the analysis of the risk drivers of CDS spreads.

The relevance of multilevel factors during and after financial crises
We complete our analyses by looking at subsample results. Our data start in 2009 and end in 2017. In the first part of the sample, our data cover the last months of the global financial crisis as well as the European sovereign crises. Both these events had a relevant impact on the sovereign CDS market; see, among many others, Caporin et al. (2017) and Caporin et al. (2018) and the references cited therein. Therefore, we run the previous analyses over two periods, the first starting in 2009 and ending in 2012, and the second covering the years from 2013 to 2017. We comment here on the main findings; Appendixes A and B include all figures and tables. In terms of loadings to latent factors, we do observe some differences. If we consider the Debt/GDP country groupings, the loadings observed in 2009-2012 for the global factor are higher than those recorded in 2013-2017. A possible explanation is the increased heterogeneity in the country fundamentals that, in turn, leads to an increased heterogeneity in the CDS behaviour and a decrease in the response to movements in the global factor. We might see a confirmation of this view in the larger dispersion of the loadings to the group-specific factors we [...]
The correlation of residuals from the various models as well as the correlation among latent factors is in line with the full-sample analysis: a relevant reduction in the correlation among residuals, with close results between the PCA and the multilevel models; a high correlation among global factors and between them and the first principal component; the absence of a matching between the other principal components and the latent factors as well as between latent factors (global excluded) from the two multilevel models.
More interesting findings emerge from the risk contribution analysis. In the rating case, we note a first relevant change in the role exerted by the factor F124 (it excludes the Medium-Low rating countries): in the first sample, it represents a relevant fraction of the risk contribution (up to 7% for the Medium-High group), while in the period 2013-2017 its contribution is close to zero, consistent with the full-sample case. We attribute this to changes in country ratings across time, with possible effects on the group composition and the subsequent factor interpretation. This is also in line with the second relevant modification we observe, associated with the increased role of the group-specific factor for the High rating countries. The role of the factor increases in the sample 2013-2017, jumping to 50% of the risk contribution for the High rating countries. This signals a sort of separation effect induced by tension in the CDS market after the global financial crisis and the European sovereign crisis, with High rating countries being influenced mostly by their own shocks and less dependent on global and semi-pervasive factors. For the latter, for the High rating group, we observe a decrease of the risk contribution from 43% to 32% for the global factor and from 18% to 10% for factor F123 (excluding Low rating countries). A similar, but less clear, effect is observed for the Medium-High rating group, where the group-specific factor shows an increase in the risk contribution from 4% to 23%. We observe similar patterns in the Debt/GDP grouping case. The global factor contribution to the risk decreases in a significant way for the High Debt/GDP group, moving from 45% in 2009-2012 to 29% in 2013-2017, but also for the other two groups, from 84% to 71% for the Medium group and from 80% to 67% for the Low Debt/GDP ratio group.
Furthermore, the factor F12, focusing on the High and Medium groups, shows some decrease in the risk contribution for the High group (from 46% to 37%), but the other two semi-pervasive factors, and in particular F23, record a relevant increase, as their risk contribution multiplies by three. Finally, the specific factor for the High group moves from a 4% contribution in 2009-2012 to an 18% contribution in 2013-2017, thus again suggesting a relevant role for shocks associated with the group members. We also observe that the residual term, not attributed to any of the factors, noticeably increases in the most recent sample. This could signal, again, that the post-crisis period has a behaviour that partially differs from the 2009-2012 period. In the principal component case, the risk contribution analysis over the two subsamples confirms the role of the first principal component, but highlights that its contribution to the overall risk reduces in the second sample, 2013-2017, where the residual term increases and the second component has some role.
Finally, if we consider the regression of the latent factors on the set of economic drivers, we find a confirmation of the full sample outcome: while the principal component factors are more homogeneous in the exposure to economic drivers, the factors extracted by a multilevel model are heterogeneous in the exposures to exogenous world variables. This evidence holds in both subsamples.

Concluding remarks
Latent factor models represent a common tool in the analysis of macroeconomic and financial variables. In many cases, to estimate factors we resort to principal component analysis. However, this approach has some limitations, in particular for the economic interpretation of the latent factors. We introduced a special multilevel model to overcome this limitation by decomposing a collection of variables into mutually exclusive groups. The model used the groups to identify a set of pervasive, semi-pervasive, and group-specific latent factors. These factors turned out to be easier to interpret, thanks to their relation with known groups of variables. Further, the use of the multilevel model might provide an easier interpretation of the loadings and the identification of risk sources. We supported the model by simulations as well as by an empirical analysis based on global CDS spreads. The approach we put forward might be of interest in all areas in which the variables of interest can be easily classified according to extra-sample information.
Table B6: Risk contribution of the PCA model on the total risk of equally weighted portfolios by country groups based on Debt/GDP or on country rating in the subsample 2013-2017.