Risk-Adjusted Models of Costs Referable to General Practitioners Based on Administrative Databases in the Friuli Venezia Giulia Region in Northern Italy

Setting: The 2007 administrative databases of the National Health Service of the Friuli Venezia Giulia Region were the data source. Data referred to the general population and included information on the use of health services (inpatient, outpatient, medication, home care) as well as on major chronic health problems. Data also identified persons who, because of their health condition, are exempt from the contribution usually required for using health services (ticket exemption).


Introduction
A classification scheme of patients is a necessary tool for the evaluation of health services in primary health care (PHC). A classification of the persons who need assistance, rather than of their illnesses, provides an instrument that accounts for all of a patient's problems, the need for health services she or he has expressed, and the evolution of the severity of the condition over time. In Europe, patient classification systems (PCS) are used primarily to categorise hospital data, and applications in other settings are rare. Yet it would be of equal or even greater importance to focus on care management. The concept of episode of care has been elaborated in order to deal with output and outcome comparison, and it has led to the development of some classification schemes and commercial software [1][2][3]. However, these systems require clinical data of a quality not common in Europe. Additionally, data collection is often complex and expensive, and creates privacy problems.
The Italian NHS provides comprehensive health care coverage to the entire population through geographically based organizations called local health organizations (ASL in Italian). ASLs offer primary, hospital, and outpatient care (all diagnostic procedures, tests, and referrals to specialists) and medications, either directly or through contracted providers. A large information system has been developed over time to support administration. Information on treatments received by any Italian citizen under the umbrella of the NHS is available to ASLs, regions, and the Ministry of Health. The information is contained in large administrative databases, which also include clinical information. In addition, every citizen receives a unique health identifier, which is included in every health record. Through a record linkage process, this makes it possible to describe all the treatments she or he has received from the NHS.
Given the information available, it is easy to calculate output and outcome indicators. However, there is no risk adjustment model that allows a proper comparison of GPs, ASLs, or health regions, which often assist populations with a different casemix. The Italian NHS offers a convenient solution to this problem. Although drugs and outpatient services are nominally free of charge, the patient makes a small contribution (called a ticket). However, persons affected by severe conditions or chronic diseases, or with a low income, are exempted from the ticket. The list of the diseases and conditions that allow ticket exemption, based on the ICD-9-CM classification, is set forth by a national law, and is reported in Table 1. Information on ticket exemptions is collected in administrative databases, and thus provides clinical information at the population level. Ticket exemptions can therefore be evaluated as a source of information on which to develop risk adjustment models.
The Region Friuli Venezia Giulia, in north-eastern Italy, has 1.2 million inhabitants, six ASLs, and eighteen hospitals. Complete and accurate records of the whole population and health services have been collected in networked administrative databases since 1970. This paper presents the results of a study that exploits these administrative databases to develop a risk adjustment model for PHC, using ticket exemption as a source of information on the health status of the population.

Material and Methods
Administrative databases concerning hospitalizations, medications, outpatient services, ticket exemptions, and the list of citizens associated with a GP in 2007 were made available by the Regional Health Authority of Friuli Venezia Giulia. All records included the unique health identifier. Children under fourteen were excluded, since they are usually (but not exclusively) registered with a family paediatrician. Also, data on drug costs of nursing home residents (about 6,500 people) was not included, since they receive medications through a different channel. All referrals from the GP to a specialist, as well as self-referred contacts by patients, are recorded in the outpatient database. The database includes all services obtained through the NHS, including visits to a medical specialist, lab tests, radiology, imaging, ambulatory surgery, and so on. All services provided during a hospitalization episode of any kind are excluded. Services received in the emergency room are included if they were not followed by hospitalization.
An anonymous code was substituted for the personal identifier before data was released to the research team to protect patient privacy. Databases were then linked by this code to fully reconstruct goods and services received from the NHS.
The total individual daily tariff was selected as the dependent variable. It was obtained by summing up the prices of prescribed drugs and tariffs of outpatient and inpatient health care services received. Tariffs do not always correspond to real costs, but they allow an indirect measure of the burden involved with different types of services. They are the only means to reasonably compare different services (hospitalization episodes, lab tests, etc.).
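The linkage and summation steps described above can be sketched as follows. This is only an illustration under stated assumptions: the anonymous codes, the tariff values, and the function name are hypothetical, and the real databases are of course far larger and richer than these toy lists.

```python
from collections import defaultdict

# Hypothetical extracts: (anonymous_code, tariff_in_euro) pairs per database.
drugs = [("A1", 12.50), ("A2", 4.00), ("A1", 7.50)]
outpatient = [("A1", 20.00), ("A3", 35.00)]
inpatient = [("A2", 2500.00)]

# Hypothetical GP registry: every registered citizen, including non-users.
registered = ["A1", "A2", "A3", "A4"]


def total_tariff_by_patient(registry, *sources):
    """Link all sources by anonymous code and sum tariffs per citizen.

    Citizens with no record in any source keep a total of 0.0.
    """
    totals = defaultdict(float)
    for source in sources:
        for code, tariff in source:
            totals[code] += tariff
    return {code: totals.get(code, 0.0) for code in registry}


totals = total_tariff_by_patient(registered, drugs, outpatient, inpatient)
```

Starting from the registry rather than from the service records keeps citizens who had no contact with the NHS in the analysis, with a total of zero.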
The time unit was the number of days during which each citizen was registered with the same GP (usually 365 days; a shorter period in case of GP change or patient death during the year). The dependent variable was transformed by means of the inverse hyperbolic sine function, which, unlike the logarithmic transformation, is defined for all real numbers (in particular 0). The data analysed was structured hierarchically, as patients were grouped by their GP.
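As an illustration of the transformation just described, the following minimal sketch computes a daily tariff and applies the inverse hyperbolic sine. The function name and the example figures are hypothetical; only the use of the inverse hyperbolic sine on a per-day tariff is taken from the text.

```python
import math


def transformed_daily_tariff(total_tariff, days_registered):
    """Return asinh(total tariff / days registered with the same GP)."""
    daily = total_tariff / days_registered
    return math.asinh(daily)


# A citizen with no expenditure maps to exactly 0, where log() is undefined.
zero_user = transformed_daily_tariff(0.0, 365)

# For large values asinh(x) is close to log(2x), so it behaves like a log.
gap_at_100 = abs(math.asinh(100.0) - math.log(200.0))
```

Because asinh(x) = ln(x + sqrt(x^2 + 1)), the transformation is roughly linear near zero and roughly logarithmic for large tariffs, which is why it can stand in for a logarithm when many observations are zero.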
The data structure is characterized by information measured at two levels: patient level (level 1) and GP level (level 2). The set of level 1 variables included the main available patient characteristics. When the dependent variable is roughly continuous over strictly positive values but is zero for a large proportion of individuals, the Tobit approach is a more useful solution than standard regression modelling [4,5], and is characterized by latent variable modelling. The level 1 Tobit model can be described by the following equations:
$$y_i^* = x_i'\beta + \varepsilon_i, \qquad y_i = \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$
where $i = 1, 2, \ldots, N$ indexes patients, $y_i^*$ is the latent variable measuring the real expenditure of the $i$-th patient, $y_i$ measures the observed costs of the $i$-th patient, $x_i$ is the vector of patient characteristics, $\beta$ is the vector of parameters to be estimated, and $\varepsilon_i$ is an independent and identically distributed error term.
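To make the latent variable formulation concrete, the sketch below writes out the log-likelihood of a Tobit model censored at zero. It is an illustration, not the estimation code used in the study: it assumes normally distributed errors with standard deviation sigma (the usual Tobit assumption, which the text does not state explicitly), and the function names and data are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(y, X, beta, sigma):
    """Log-likelihood of a Tobit model left-censored at 0.

    Assumes errors are i.i.d. normal with mean 0 and s.d. sigma.
    """
    loglik = 0.0
    for y_i, x_i in zip(y, X):
        xb = sum(b * x for b, x in zip(beta, x_i))
        if y_i > 0:
            # Uncensored observation: normal density of the residual.
            z = (y_i - xb) / sigma
            loglik += -math.log(sigma) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z
        else:
            # Censored observation: probability that the latent value is <= 0.
            loglik += math.log(normal_cdf(-xb / sigma))
    return loglik


# Toy data: one non-user and one user, intercept-only model.
example = tobit_loglik(y=[0.0, 1.0], X=[[1.0], [1.0]], beta=[0.0], sigma=1.0)
```

Zero observations contribute a probability mass and positive observations contribute a density, which is what distinguishes the Tobit likelihood from ordinary least squares on the same data.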
The hierarchical structure of the data can be exploited to specify a multilevel solution for the analysis. Therefore, multilevel Tobit models can be estimated [6][7][8]. The multilevel approach allows the total variability to be separated into two components, variability within GPs and variability between GPs, controlling for the differences in patient characteristics. Let $j = 1, 2, \ldots, J$ index GPs (level 2 units) and $i = 1, 2, \ldots, n_j$ index patients (level 1 units) assisted by the $j$-th GP. Let $y_{ij}^*$ be a latent variable as before. The multilevel Tobit model is then defined as a latent variable model:
$$y_{ij}^* = x_{ij}'\beta + u_j + \varepsilon_{ij}, \qquad y_{ij} = \begin{cases} y_{ij}^* & \text{if } y_{ij}^* > 0 \\ 0 & \text{otherwise} \end{cases} \qquad (2)$$
where $x_{ij}$ is the vector of characteristics of the $i$-th patient assisted by the $j$-th GP, $\beta$ is the vector of parameters to be estimated, $u_j$ is an independent and identically distributed level 2 error term, and $\varepsilon_{ij}$ is an independent and identically distributed level 1 error term; the two error terms are assumed to be uncorrelated.
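The data-generating process implied by the multilevel Tobit model can be sketched with a small simulation. Everything here is hypothetical (the single covariate, the parameter values, normal error terms, and the function name); the only elements taken from the text are the GP-level random term, the patient-level error, and censoring of the latent expenditure at zero.

```python
import random


def simulate_multilevel_tobit(n_gps, patients_per_gp, beta0, beta1,
                              sd_u, sd_e, seed=0):
    """Simulate (gp, x, latent y*, observed y) rows from a random-intercept Tobit."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_gps):
        u_j = rng.gauss(0.0, sd_u)  # level 2 (GP) error term, shared by patients
        for _ in range(patients_per_gp):
            x = rng.random()  # hypothetical patient covariate
            e_ij = rng.gauss(0.0, sd_e)  # level 1 (patient) error term
            y_star = beta0 + beta1 * x + u_j + e_ij
            y = max(0.0, y_star)  # observed cost is censored at zero
            rows.append((j, x, y_star, y))
    return rows


rows = simulate_multilevel_tobit(n_gps=3, patients_per_gp=4,
                                 beta0=-0.5, beta1=2.0, sd_u=0.1, sd_e=1.0)
```

All patients of the same GP share one draw of the level 2 term, so the between-GP variability is governed by `sd_u` and the within-GP variability by `sd_e`.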
The intraclass correlation coefficient (ICC) is a measure of the proportion of the total variability explained by the variability between groups [6]. In standard linear models the $R^2$ coefficient, which measures the explained proportion of variance, is the basic estimate of model goodness of fit. However, defining $R^2$ in this way for hierarchical linear models is rather problematic. An alternative indicator is obtained by defining $R^2$ (at the patient level) as the proportional reduction of error for predicting the level 1 dependent variable with respect to the model without any predictors [9]. For this reason, this indicator will be called $R_1^2$ in the rest of the paper. Models (1) and (2) were estimated using Stata software [10], the former by means of the tobit command, the latter through the gllamm (Generalized Linear Latent and Mixed Models) procedure [11]. Model (2) could also have been estimated with the xttobit command of Stata. However, the adaptive quadrature implemented in gllamm is superior in situations involving large cluster sizes [12,13]. Slow convergence was the price paid for this accuracy.
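The two indicators can be computed from estimated variance components as in the sketch below. The ICC is the between-group share of total variance. For the level 1 indicator, the code uses a common formulation of the proportional reduction in prediction error (one minus the ratio of total residual variance in the fitted model to that in the model without predictors); that this is exactly the formula applied in the study is an assumption, and the variance values are hypothetical.

```python
def icc(var_between, var_within):
    """Share of total variance attributable to differences between groups."""
    return var_between / (var_between + var_within)


def r2_level1(var_u_null, var_e_null, var_u_model, var_e_model):
    """Proportional reduction of level 1 prediction error vs. the null model."""
    return 1.0 - (var_u_model + var_e_model) / (var_u_null + var_e_null)


# Hypothetical variance components (GP level, patient level).
icc_example = icc(var_between=0.05, var_within=4.95)
r2_example = r2_level1(var_u_null=0.5, var_e_null=9.5,
                       var_u_model=0.05, var_e_model=4.95)
```

With these illustrative numbers the ICC is 1% and the level 1 indicator is 0.5, meaning the covariates halve the total unexplained variance while very little of what remains lies between GPs.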

Results
The database analysed included 1,067,239 citizens registered with a GP in 2007 in the region. Considering that 37,029 (3.5%) persons changed their GP at least once during the year, leading to more than one record, the actual number of patient records (level 1 units) analysed was 1,105,759. Data came from 1,129 GPs, each of whom had an average of 1,109 citizens registered (standard deviation 392). The number of people with at least one ticket exemption was 461,532. Their mean age was 67.8 (standard deviation 17.3). Those having two or more exemptions were 35,964 (7.8%), leading to 505,932 records in the corresponding database. Table 2 shows the most frequent exemptions, which were included in the models. Table 3 reports the number of records in the administrative databases regarding hospital and outpatient services and medications.
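The percentages quoted above can be reproduced from the reported counts with a few lines of arithmetic. The variable names are illustrative; the counts are those given in the text.

```python
registered = 1_067_239        # citizens registered with a GP
changed_gp = 37_029           # citizens who changed GP at least once
exempt_persons = 461_532      # persons with at least one ticket exemption
multi_exempt = 35_964         # persons with two or more exemptions
exemption_records = 505_932   # records in the exemption database

pct_changed = round(100 * changed_gp / registered, 1)
pct_multi = round(100 * multi_exempt / exempt_persons, 1)

# Records beyond the first exemption of each exempt person.
extra_records = exemption_records - exempt_persons
```

The share of persons with multiple exemptions is computed among exempt persons, not among all registered citizens.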
Three multilevel Tobit models were estimated: one model without any predictors, one model including only gender and age covariates, and the full model with variables accounting for patient casemix characteristics added to age and gender in the linear predictor. Table 4 reports the reduction of prediction errors at the patient level when covariates are included in the multilevel Tobit model for total expenditures. Estimation results of the multilevel Tobit model for total expenditure are reported in Table 5.
The ICC from the multilevel model estimation shows that, controlling for the differences in patient casemix, the amount of GP-driven residual variability of the total individual health care tariffs is negligible (0.89%). The multilevel model produces similar results for the individual costs for inpatient services (ICC<0.1%), outpatient services (ICC=1.49%), and drug prescriptions (ICC=2.0%).
Based on this result, only the standard Tobit model (i.e., not accounting for the hierarchical structure of the data) was considered. This choice was also supported by the closeness of the estimated coefficients between the standard and the multilevel Tobit models. All patient variables were significantly and, except for GP change, positively associated with the dependent variable in all models. The standard Tobit model, where each specific type of health service is considered as a dependent variable, leads to similar conclusions (Table 6).
None of the characteristics associated with GPs, such as age, gender, seniority, and working in groups or alone, showed a statistically significant association with the dependent variable in any of the tariff models. These variables were therefore not included in the model.

Conclusion
Risk adjustment is a necessary step to properly evaluate outcomes and costs of medical care. By now, casemix adjustment is a nearly universal routine in the hospital setting. This is not the case with PHC. Although PCSs are available for PHC, as mentioned in the introduction, their use is not widespread, especially outside the US. The Adjusted Clinical Groups (ACG®), which were developed at Johns Hopkins, are among the most common PCSs in PHC. In many studies they were used to categorise PHC data [14,15]. The Clinical Risk Groups (CRG™) are another well-known PCS, developed by the same group that maintains DRGs for the Health Care Financing Administration of the US government [16,17].
However, ACG®, CRG™, and other episode software packages require a level of data detail that can be found only in the patient record. Thus, they are built from providers' information systems, whether the providers are practices, single GPs, or more complex organizations [18]. It would be hard anywhere, and is certainly impossible in Italy, to use ACG® in studies covering the whole population, which is what is needed to manage an NHS. Evaluation should be based both on GP and population data, and the analyses performed should take into account the contribution of people who had no contact with the NHS whatsoever. Other methods must be developed to avoid delaying PHC risk adjustment in Italy. The easiest strategy is to consider administrative databases.
Administrative databases are a convenient source of information for the management and evaluation of health care systems. They are already available and often provide low-cost data on large number of patients or citizens, or even the whole population of a given area. However, the quality of this data may not be excellent. Since Lisa Iezzoni proposed the use of administrative databases to evaluate the quality of health services [19], they have been increasingly used in many research areas [20]: quality of care assessment, estimating adherence to best practice, cost evaluation and epidemiology of selected diseases, potential benefits and harms of specific health policies, and disease and intervention registries. The number of published works on the topic is large, particularly those using hospital data, but a literature review is beyond the scope of this article.
In Italy, in addition to studies based on hospital discharge abstracts, attention is paid to databases of medications and outpatient contacts. For instance, a large epidemiologic multicentre study estimated incidence and prevalence of a number of chronic diseases [21]. The Lombardia Regional Health Authority has developed a data warehouse that combines a number of health administrative databases containing data on a population of eight million people [22]. Other studies have used administrative databases to evaluate appropriateness of drug prescriptions [23,24]. In Friuli Venezia Giulia, the Regional Health Authority has developed a data warehouse system that contains several years of data from administrative databases and allows for fast data mining. This system has been widely used for health service evaluation and epidemiological studies [25][26][27][28][29][30].
In our study the issue was whether the clinical information provided by the ticket exemption file would enable us to build a predictive model. Given the purely administrative purpose of the exemption, epidemiologists are in fact sceptical about its real value as a source of information for risk adjustment. The authors know of only one other large study that uses ticket exemptions [22].
Models that use ticket exemption information had robust results. Since the exemption is granted after a specialist physician's diagnosis based on standard criteria, it can be regarded as very reliable. However, there are quality issues in the data. For instance, the prevalence of diabetes exemptions is much lower than is reported in the literature [31]. Nevertheless, owing to the large number of individuals, the available information allows good estimates of the trend of tariffs to be derived. Estimated coefficients have a narrow confidence interval, and the percentage of error reduction $R_1^2$ (the equivalent of the variance explained in Tobit statistical models) is large.
In our model age has an important predictive value. The age coefficient increases steadily in the models for the sum of all expenditures, and the oldest patients use ten times more resources than patients aged 35-44 when all other variables are equal. This is even more evident if we look at the inpatient tariffs. This means that outpatient care (recall that outpatient contacts include radiology and lab tests) is a resource commonly used by all age groups, and this may conceal a degree of inappropriateness in the use of diagnostic resources.
Women tend to spend more in all sectors, even after controlling for pregnancy. It is worth noting that being pregnant has a protective effect on medication expenditures. Considering total expenditures, cancer and diabetes are the health conditions that consume the most resources. In the other models, the costliest illnesses are cancer, diabetes, and hypertension in the outpatient model; cancer and cardiovascular disease in the hospitalization model; and hypertension and diabetes in the medication cost model (this is probably because cancer medications are given in hospitals and do not fully appear in this data). Even though there are substantial differences in the statistical methodology and underlying organization of the health services, the variance explained by our models is largely comparable to the values found in the development of the ACG® at Johns Hopkins [14]. However, our model maintained its power when analysing the total charges, while the ACG® system sees a significant drop in the variance explained. The comparison with CRG™ is more difficult, since several models were developed from different data sources using a number of different approaches. Nonetheless, the $R^2$ reported ranged from 0.12 to 0.14, considerably lower than our findings [17].
In Italy there are grounds for adopting the use of administrative databases, and ticket exemption databases in particular, for more complex tasks as well, such as risk adjustment in PHC. The inclusion of patient casemix characteristics in the models adds strong predictive power. Their inclusion significantly reduced residual cost heterogeneity in the model predicting the total expenditure at the patient level, compared both to the model without any predictor and to the model including gender and age only, and is therefore highly recommended.
The most surprising finding, however, was that, controlling for casemix, the variability of the total expenditures at the GP level is very low. Variation in medical behaviour is a well-known phenomenon widely described in the literature, and many hypotheses have been developed to explain it [32][33][34][35][36][37]. Only a few studies have so far investigated differences among practices after any sort of risk adjustment (apart from the usual sex and age standardization). For instance, a study carried out in an area close to the Friuli Venezia Giulia Region found almost no variance ascribed to GPs in the distribution of hospitalization tariffs after adjustment for PCS [38]. However, when considering medication cost [39], a residual variance remained, even though a marked decrease in the overall variance was observed after applying risk adjustment.
Comparison with other studies is not straightforward, given the differences in study design and in the dependent variables. A study on the number of home visits reported a statistical analysis similar to that of the present study, showing an ICC of 1.6% after adjustment [40]. This result is very similar to the one found in the outpatient services model, even though the services and the dependent variable (tariff instead of service frequency) considered in the analysis are quite different. A number of studies adopted a multilevel approach, taking the frequency of services (encounters, referrals, or prescriptions) as the dependent variable. In one study the variance unexplained by the models at the practice level was between 12.9% and 27.2% according to the type of service [41]. Two more studies demonstrated a much lower residual unexplained variance at the practice level, comparable to the one found in the present study (0.1% for prescription number [42] and 3.6% for referrals [43]). A study on prescription cost showed a low variance explained by physicians after risk adjustment (1.8%) [44]. A generalization derived from this picture is that the application of risk adjustment models explains a large share of variation, and decreases the variation attributable to a single GP or practice. However, in the present study this effect appears much stronger and more evident, considering that the average number of citizens registered with a GP (around 1,100) is much lower than in the UK.
This study has some particular features that may help to explain the homogeneous behaviour of GPs. First, it is population-based (the entire population of Friuli Venezia Giulia); that is, it examines data of all patients without any socio-economic bias. Second, the study population is very large, much larger than the study population of other research. Third, it considered all major services provided to the population. Fourth, it was conducted in a region among those with the lowest per capita overall expenditure: the lowest hospitalization rate and the lowest per capita medication expenditure in Italy [45]. Fifth, it used the sum of service tariffs or medication prices as a dependent variable, which is here assumed as a proxy of the burden on the health system of caring for individuals or groups of patients with common conditions. Could some of these features be associated with the low variance attributable to GPs?
The size of the population gives power to the statistical results. The comprehensiveness of the services involved in the analysis demonstrates that physician behaviour is not related to a single part of the caring process, but can be regarded as a generalised professional attitude. The use of tariffs can change the results, especially when considering non-homogeneous services. This is particularly true for hospital care. Usually, indicators include hospitalization rates, but it is quite obvious that there is a difference among hospitalizations for pathologies of different burden. The use of tariffs, which are derived from DRG weights, takes this burden into account. Our hypothesis is that in the Friuli Venezia Giulia Region, inpatient services are devoted to patients with homogeneous severity of conditions. Hospital use appears highly appropriate when comparing regional rates of hospitalization with national rates (Friuli Venezia Giulia has the second lowest age and sex standardised hospitalization rate among Italian regions [46], and the highest rate of utilization of home care [45]). This is no longer true for other types of services, such as medications, where a non-optimal adherence to clinical guidelines is documented [24]. Accordingly, a larger variance due to GPs is observed for medication use.
Risk adjustment is therefore necessary for PHC indicators. Outcome and output measures without proper risk adjustment may lead to biased findings. Age and sex adjustment appears to be insufficient to adequately account for differences among the conditions of GPs' patients.
The major weaknesses of this study appear to be the following:
1. The short period of observation (one year). This is largely compensated for by the size of the study population.
2. The lack of drug cost data on people in nursing homes. These are high consumers of medication, and also usually have several ticket exemptions. Thus this may affect the model coefficients.
3. Ticket exemptions are issued for chronic conditions only. Thus acute conditions and accidents are not considered in the models.
4. Data quality problems in the ticket exemption database. These are of two kinds: first, as already stated, the completeness of the information (many people who have a given condition do not have the corresponding exemption); second, the diagnosis specificity (for instance, cancer is considered only one category, the same for diabetes, and so forth). Given the power of the models, however, these weaknesses may become an asset; if introduced for relevant purposes, ticket exemption information would certainly increase in quality, and the categories may be modified accordingly. This would further increase the power of the models and their capacity to explain variation.