Modeling Health Insurance Claims with Extreme Observations

A case study of the Iran Insurance Company

 

Mahsa Mir Maroufi Zibandeh

Allameh Tabataba'i University, ECO College of Insurance, Tehran, Iran

*Corresponding Author E-mail:

 

ABSTRACT:

In modeling insurance claims, when there are extreme observations in the data, the commonly used loss distributions are often able to fit the bulk of the data well but fail to do so at the tail.

 

One approach to overcome this problem is to focus only on the extreme observations and model them with the generalized Pareto distribution as supported by extreme value theory. However, this approach discards useful information about the small and medium-sized claims which is important for many actuarial purposes. In this study we consider modeling large skewed data using a highly flexible distribution, the generalized lambda distribution, and the recently proposed semiparametric transformed kernel density estimation.

 

Considering the medical claims of the Iran Insurance Company in the Iranian calendar years 1389 and 1390, we observe that the data are strongly skewed to the right. When the models are applied to the full data set (no threshold), the transformed kernel and GPD models fit the medical claims well, but the GLD model is not adequate for the higher claims. For claims above 15,000,000 Rials, all models fit the empirical data well. Finally, Value-at-Risk estimates are given. Based on the results, we suggest using the transformed kernel density to estimate the loss distribution. Consequently, losses can be estimated more accurately, the relevant premium can be charged, and the insurance company can expect a decline in its loss ratio.

 

KEY WORDS: Extreme value theory; Kernel density; Value-at-risk; Generalized Pareto distribution; Generalized lambda distribution.

 


 

INTRODUCTION:

With a population of almost 70 million, Iran is one of the most populous countries in the Middle East. Total healthcare spending was expected to rise from $24.3 billion in 2008 to $50 billion by 2013, reflecting the increasing demand for medical services. Total health spending was equivalent to 4.2% of GDP in Iran in 2005, and 73% of all Iranians have health care coverage.

 

On the other hand, according to figures released by the Central Insurance of Iran, the loss rate of health insurance was 46.12% and 39.12% in 1388 and 1389 respectively. The Central Insurance of Iran has also declared on its official website that the health insurance premiums produced by commercial insurance companies totalled 9,850 billion Rials in 1389, while the compensation paid was about 9,875 billion Rials, which gives a loss ratio of 1.003 for 1389. Furthermore, the health insurance premiums produced by commercial insurance companies totalled 15,834 billion Rials in 1390 and the compensation was about 14,661 billion Rials, so the loss ratio for 1390 is 0.926. The loss ratio for health insurance in the Iran Insurance Company was 0.94 and 0.90 in 1389 and 1390 respectively. Compared with some other insurance lines in our country, health insurance has a higher loss ratio.

 

This high loss ratio increases the potential loss of an insurance company to the point where the line may no longer be economically viable. Given this high loss rate, insurers have to estimate and forecast their loss distribution with more accurate modeling tools in order to compensate their insureds and continue their activity. These estimates help insurers set premiums that are fully aligned with their losses. Although extreme losses in this line occur with low probability, they are of considerable importance and strongly affect the estimation of future losses.

 

Recently, fat-tailed distributions and extreme value theory have been used by Iranian risk managers to model large claims in lines such as fire insurance and catastrophe claims in reinsurance. These studies generally show the importance of such models in estimating insurance claims.

 

The specification of a loss distribution is a key ingredient of any modeling approach in actuarial science and financial risk management. For insurers, a proper assessment of the size of a single claim is of utmost importance. Traditional methods in the actuarial literature use parametric specifications to model single claims.

 

A method which does not require the specification of a parametric model is nonparametric kernel smoothing. This method provides valid inference under a much broader class of structures than those imposed by parametric models.

 

The detailed analysis of the available models leads to many unsolved problems of theoretical and practical importance, and this field of research continually generates new challenges. The present contribution is a further piece of this big puzzle.

 

Health insurance is one of the main insurance services. People have always faced many diseases and need a way to finance the cost of their treatment. To help in such situations, insurance companies offer various kinds of health insurance. In most countries, government employees and industrial and production units also use group health insurance.

 

According to official data, more than 90% of Iranians are covered by at least one kind of health insurance. However, the range of health services provided to patients has expanded so much that covering them within the framework of basic health insurance is no longer economically viable. In many countries, complementary health insurance has been used to provide these services.

In complementary health insurance coverage, the only ground for terminating the insurance contract is failure to pay premiums; the coverage rests on the assumption that the insured is able to pay them.

 

Under this coverage, the insured first pays the total cost and the insurance company then reimburses the amount, up to the full coverage limit.

 

Using both governmental and private complementary health insurance, and creating competition among the providers, can play an important role in improving the quality of health insurance, raising the level of customer satisfaction and, ultimately, improving public health (Vafaee et al., 2007).

 

LITERATURE:

Standard statistical methodology, such as integrated error and likelihood, does not weight small and large losses differently when evaluating an estimator. Thus, these evaluation methods do not emphasize the most important part of the error: the error in the tail.

 

Practitioners often decide to analyze large and small losses separately because no single, classical parametric model fits all claim sizes. This approach leaves some important challenges: choosing an appropriate parametric model, identifying the best way to estimate its parameters, and determining the threshold between large and small losses.

 

 

Embrechts et al. (1997) provided a textbook on modelling extremal events for insurance and finance. It is a comprehensive treatment of probabilistic models for large and extreme values of sequences of random variables and of the statistical problems involved in fitting appropriate distributions to empirical data of this type. The presentation of the statistical methods is amply illustrated with concrete examples of data analysis from insurance, finance, hydrology and other fields, and recent results concerning ruin probabilities when the claim distributions have heavy tails are presented.

 

McNeil (1997) provided an extensive overview of the role of extreme value theory in risk management as a method for modeling and measuring extreme risks. Bolancé et al. (2003) suggested using kernel density estimation for actuarial loss functions. They estimated actuarial loss functions based on a symmetric version of the semiparametric transformation approach to kernel smoothing and applied the method to an actuarial study of automobile claims. The method gives a good overall impression for estimating actuarial loss functions and is capable of capturing both the initial mode and the heavy tail that are so typical of actuarial and other economic loss distributions. They studied the properties of the transformed kernel density estimator, showed how it differs from the multiplicative bias-corrected estimator with variable bandwidth, and added insight into the transformation kernel smoothing method through an extensive simulation study, with a particular view to the performance of the estimator in the tail.

 

Bolancé et al. (2005) studied kernel density estimation for heavy-tailed distributions using the Champernowne transformation. In that study, a unified approach to the estimation of loss distributions was presented. They proposed an estimator obtained by transforming the data set with a modification of the Champernowne cdf and then estimating the density of the transformed data using the classical kernel density estimator, and they investigated the asymptotic bias and variance of the proposed estimator. In a simulation study the proposed method showed good performance, and they also presented two applications dealing with claim costs in insurance.

 

Yamada and Primbs (2002) presented a Value-at-Risk and Conditional Value-at-Risk estimation technique for dynamic hedging and investigated the effect of higher-order moments of the underlying on the hedging loss distributions. They first approximated the underlying stock process through its first four moments, including skewness and kurtosis, using a general parameterization of multinomial lattices, and solved the mean-square optimal hedging problem. They then plugged the moment information into the generalized lambda distribution to extract the hedging loss distribution and estimated its VaR. Finally, they demonstrated through numerical experiments how the hedging error distribution changes with non-zero kurtosis and skewness in the underlying, and examined the relation between the VaR and CVaR of the hedging loss distributions and the kurtosis of the underlying.

 

Gustafsson et al. (2007) developed a tailor-made semiparametric asymmetric kernel density estimator for the estimation of actuarial loss distributions. The estimator is obtained by transforming the data with the generalized Champernowne distribution initially fitted to the data; the density of the transformed data is then estimated using local asymmetric kernel methods to obtain superior estimation properties in the tails. In a large simulation study they found that the proposed semiparametric estimation procedure performs well relative to alternative estimators, so the approach should be useful in applied work in economics, finance and actuarial science involving non- and semi-parametric techniques. This point was demonstrated with an empirical application to operational loss data.

Chiang Lee (2009) focused on modeling and estimating the tail parameters of commercial fire loss severity. Using extreme value theory, he concentrated on the generalized Pareto distribution (GPD) and compared it with standard parametric modeling based on the Lognormal, Exponential, Gamma and Weibull distributions. In the empirical study, the thresholds of the GPD were determined through the mean excess plot and the Hill plot, Kolmogorov-Smirnov and likelihood-ratio goodness-of-fit tests were conducted to assess the fit, and VaR and expected shortfall were calculated. He also used the bootstrap method to estimate confidence intervals for the parameters. The empirical results show that the GPD method is a theoretically well-supported technique for fitting a parametric distribution to the tail of an unknown underlying distribution and that it can capture the tail behavior of commercial fire insurance losses very well.

 

THEORETICAL PRINCIPLES:

In finance and non-life insurance, the estimation of loss distributions is a fundamental part of the business. In most situations losses are small and extreme losses are rarely observed, but the number and size of extreme losses can have a substantial influence on the profit of the company. In practice, large and small losses are usually modelled separately when estimating loss distributions in insurance, because it is difficult to find a simple parametric model that fits all claim sizes.

 

In this section we present parametric and nonparametric models for estimating the claims distribution. For each model we address the following steps: choosing an appropriate model, identifying the best way to estimate its parameters, and determining the threshold between large and small losses.

 

KERNEL DENSITY ESTIMATION:

Kernel density estimators belong to the class of nonparametric density estimators. In contrast to parametric estimators, where the estimator has a fixed functional form (structure) and the parameters of this function are the only information we need to store, nonparametric estimators have no fixed structure and depend on all the data points to produce an estimate. Kernel estimators are widely used in practice, and it is well known that smoothing can produce efficiency gains in finite samples.

 

In kernel density estimation, the shape of the estimated density is determined by the data and, in principle, given a sufficiently large data set, the technique is capable of estimating an arbitrary density fairly accurately. It is a nonparametric method that does not make any distributional assumption about the underlying density. Kernel density estimation has attracted the attention of many researchers; a good introduction to the subject is given in Silverman (1986).

 

Let $X_1, X_2, \ldots, X_N$ be a random sample from an unknown density function $f$. Then the kernel density estimator of $f$ is given by

$$\hat{f}_h(x) = \frac{1}{Nh} \sum_{i=1}^{N} K\!\left(\frac{x - X_i}{h}\right),$$

where the kernel function $K(\cdot)$ is generally a unimodal probability density function and $h\,(>0)$ is a smoothing parameter, often called the bandwidth.

 

The basic properties of $\hat{f}_h$ at interior points are well known; under standard smoothness assumptions these include

$$\mathrm{E}\big[\hat{f}_h(x)\big] - f(x) = \tfrac{1}{2}\, h^{2} f''(x) \int u^{2} K(u)\,du + o(h^{2}),$$

$$\mathrm{Var}\big[\hat{f}_h(x)\big] = \frac{f(x)}{Nh} \int K(u)^{2}\,du + o\!\big((Nh)^{-1}\big),$$

so the bias shrinks and the variance grows as the bandwidth $h$ decreases.

The most common optimality criterion used to select the bandwidth is the expected $L_2$ risk, also known as the mean integrated squared error,

$$\mathrm{MISE}(h) = \mathrm{E}\!\int \big(\hat{f}_h(x) - f(x)\big)^{2}\,dx,$$

which is minimized over $h$.
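To make the estimator concrete, the following is a minimal illustrative sketch in Python; the paper itself used MATLAB and R, so the Gaussian kernel, Silverman's rule-of-thumb bandwidth and the simulated claim amounts below are choices made purely for illustration.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density used as the kernel K."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def kde(x_grid, data, h=None):
    """Classical kernel density estimator: f_hat(x) = (1/(N*h)) * sum_i K((x - X_i)/h)."""
    data = np.asarray(data, dtype=float)
    n = data.size
    if h is None:
        # Silverman's rule-of-thumb bandwidth for a Gaussian kernel
        h = 1.06 * data.std(ddof=1) * n ** (-1.0 / 5.0)
    u = (np.asarray(x_grid)[:, None] - data[None, :]) / h
    return gaussian_kernel(u).sum(axis=1) / (n * h)

# Hypothetical right-skewed "claims", purely for illustration
rng = np.random.default_rng(0)
claims = rng.lognormal(mean=15.0, sigma=1.0, size=1_000)
grid = np.linspace(claims.min(), claims.max(), 200)
f_hat = kde(grid, claims)
```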

 

 

 

The semiparametric transformed kernel density estimator

The transformed kernel density estimator provides a systematic approach to the estimation of loss distributions that is suitable for heavy-tailed situations. The estimator is obtained by transforming the data set with a parametric cdf estimated from the data and then estimating the density of the transformed data with the classical kernel density estimator.

 

Let $X_1, \ldots, X_N$ be positive stochastic variables with an unknown cdf $F$ and density $f$. The following describes the transformation kernel density estimator in detail.

 

(i) Estimate the parameters $\theta$ of the transformation cdf $T_{\theta}$ by maximizing the log-likelihood

$$l(\theta) = \sum_{i=1}^{N} \log t_{\theta}(X_i),$$

where $t_{\theta} = T_{\theta}'$ is the density corresponding to $T_{\theta}$.

(ii) Transform the data set $X_1, \ldots, X_N$ by the estimated cdf,

$$Y_i = T_{\hat{\theta}}(X_i), \qquad i = 1, \ldots, N.$$

The transformation function maps the data into the interval $(0, 1)$, and the parameter estimation is designed to make the transformed data as close to a uniform distribution as possible.

(iii) Calculate the classical kernel density estimator on the transformed data $Y_i$, $i = 1, \ldots, N$:

$$\hat{g}(y) = \frac{1}{N h\, k_b(y)} \sum_{i=1}^{N} K\!\left(\frac{y - Y_i}{h}\right),$$

where $h$ is the bandwidth and $K$ is the kernel function. The boundary correction $k_b(y)$ is required because the transformed data lie in the interval $(0, 1)$, so the estimate must be divided by the integral of the part of the kernel function that lies in this interval. For a kernel with support $[-1, 1]$ it is defined as

$$k_b(y) = \int_{\max(-1,\,(y-1)/h)}^{\min(1,\,y/h)} K(u)\,du.$$

(iv) Obtain the kernel density estimate for the original data by back-transformation:

$$\hat{f}(x) = \hat{g}\big(T_{\hat{\theta}}(x)\big)\, t_{\hat{\theta}}(x).$$

We call this method a semiparametric estimation procedure because a parameterized transformation family is used.

 

Champernowne distribution

We use a transformation based on the little-known Champernowne cdf, because it produces good results in all the studied situations and it is straightforward to apply.

 

 

 

The original Champernowne distribution has the density

$$f(x) = \frac{c}{x\left[\left(\frac{x}{M}\right)^{-\alpha} + \lambda + \left(\frac{x}{M}\right)^{\alpha}\right]}, \qquad x > 0,$$

where $c$ is a normalizing constant and $\alpha$, $\lambda$ and $M$ are parameters. The distribution was mentioned for the first time in 1936 by D. G. Champernowne.

When $\lambda$ equals $2$ and the normalizing constant $c$ accordingly equals $\alpha$, the density of the original distribution reduces to what is simply called the Champernowne distribution,

$$f(x) = \frac{\alpha M^{\alpha} x^{\alpha - 1}}{\left(x^{\alpha} + M^{\alpha}\right)^{2}}, \qquad x \ge 0,$$

with cdf

$$T_{\alpha, M}(x) = \frac{x^{\alpha}}{x^{\alpha} + M^{\alpha}}, \qquad x \ge 0.$$

The Champernowne distribution converges to a Pareto distribution in the tail, while looking more like a lognormal distribution near $0$ when $\alpha > 1$. Its density is either $0$ or infinity at $0$ (unless $\alpha = 1$).
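As an illustration of steps (i)-(iv) with the Champernowne cdf as the transformation, the sketch below fits $T(x) = x^{\alpha}/(x^{\alpha} + M^{\alpha})$ by maximum likelihood, maps the data into (0, 1), applies a boundary-corrected Gaussian kernel estimate and back-transforms. It is only a sketch under stated assumptions: the optimizer, the Gaussian kernel, the fixed bandwidth and the simulated data are choices of this example, not the paper's implementation (which used MATLAB and R).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def champ_cdf(x, a, M):
    return x**a / (x**a + M**a)

def champ_pdf(x, a, M):
    return a * M**a * x**(a - 1.0) / (x**a + M**a)**2

def fit_champernowne(data):
    """Step (i): estimate (a, M) by maximizing the Champernowne log-likelihood."""
    def negloglik(params):
        a, M = np.exp(params)            # work on the log scale so that a, M stay positive
        return -np.sum(np.log(champ_pdf(data, a, M)))
    start = np.log([1.0, np.median(data)])
    res = minimize(negloglik, start, method="Nelder-Mead")
    return np.exp(res.x)

def transformed_kde(x_grid, data, h=0.05):
    a, M = fit_champernowne(data)
    y = champ_cdf(data, a, M)                       # step (ii): transform the data into (0, 1)
    t = champ_cdf(x_grid, a, M)
    # step (iii): boundary-corrected Gaussian kernel estimate on the transformed data
    kern = norm.pdf((t[:, None] - y[None, :]) / h).sum(axis=1) / (data.size * h)
    correction = norm.cdf((1.0 - t) / h) - norm.cdf(-t / h)
    g_hat = kern / correction
    # step (iv): back-transform by multiplying with the derivative of T (the Champernowne density)
    return g_hat * champ_pdf(x_grid, a, M)

rng = np.random.default_rng(1)
claims = rng.pareto(2.0, size=2_000) * 1e6 + 250_000     # hypothetical heavy-tailed claims
grid = np.linspace(claims.min(), np.quantile(claims, 0.99), 300)
f_hat = transformed_kde(grid, claims)
```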

 

Parametric models

Although the empirical distribution function can be a useful tool in understanding claims data, there is often a natural desire to “fit” a probability distribution with reasonably tractable mathematical properties to claims data.

 

The claims actuary will also want to consider the impact of deductibles, reinsurance arrangements and inflation on the part of a claim that will be handled by the base insurance company. This involves a good understanding of conditional probabilities and distributions. For example, if X is a typical claim this year and inflation of size i is expected next year, what is the distribution of (1 + i)X? If the excess of any claim over a retention M is to be handled by a reinsurer, what is the typical claim distribution for the base insurer?

 

Generalized lambda distribution

This distribution was first introduced by Tukey (1960) and later generalized to the four-parameter case by Ramberg and Schmeiser (1974). It can produce a wide variety of curve shapes, including those of many standard symmetric and skewed distributions. The GLD has been successfully applied in a variety of disciplines, including the modeling of quantile responses in bioassay and economics, meteorology, engineering and quality management.

In Ramberg and Schmeiser's parameterization, the probability density function of the generalized lambda distribution with parameters $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ is given, at the point $x = Q(y)$, by

$$f\big(Q(y)\big) = \frac{\lambda_2}{\lambda_3\, y^{\lambda_3 - 1} + \lambda_4\, (1 - y)^{\lambda_4 - 1}}, \qquad 0 \le y \le 1,$$

with the quantile function $Q(y)$ given by

$$Q(y) = \lambda_1 + \frac{y^{\lambda_3} - (1 - y)^{\lambda_4}}{\lambda_2},$$

where $\lambda_1$ and $\lambda_2$ are location and scale parameters respectively, and $\lambda_3$, $\lambda_4$ are shape parameters (governing skewness and kurtosis, respectively).
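A minimal sketch of the Ramberg and Schmeiser GLD in Python follows: the quantile function Q(y), the density evaluated at x = Q(y), and sampling by the inverse-transform method. The parameter values in the example are hypothetical; in the paper the GLD was fitted with the R package gld, not with this code.

```python
import numpy as np

def gld_quantile(y, lam1, lam2, lam3, lam4):
    """Q(y) = lam1 + (y**lam3 - (1 - y)**lam4) / lam2, for 0 < y < 1."""
    return lam1 + (y**lam3 - (1.0 - y)**lam4) / lam2

def gld_density_at_quantile(y, lam1, lam2, lam3, lam4):
    """f(Q(y)) = lam2 / (lam3 * y**(lam3 - 1) + lam4 * (1 - y)**(lam4 - 1))."""
    return lam2 / (lam3 * y**(lam3 - 1.0) + lam4 * (1.0 - y)**(lam4 - 1.0))

# Inverse-transform sampling: feed uniform variates through the quantile function.
rng = np.random.default_rng(2)
u = rng.uniform(size=10_000)
sample = gld_quantile(u, lam1=0.0, lam2=0.2, lam3=0.1, lam4=0.01)   # hypothetical parameters
```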

 

Generalized Pareto Distribution

The fundamental Fisher-Tippett (1928) theorem of classical extreme value theory states the following. Let $X_1, X_2, \ldots$ be a sequence of independently and identically distributed random variables with common distribution function $F$, and denote the sample maximum of the first $n$ observations by $M_n = \max(X_1, \ldots, X_n)$. If there exist sequences of normalizing constants $a_n > 0$ (scale) and $b_n$ (location) and a non-degenerate distribution function $H$ such that

$$\lim_{n \to \infty} P\!\left(\frac{M_n - b_n}{a_n} \le x\right) = H(x), \qquad x \in \mathbb{R},$$

then $H$ must be one of the three types of extreme value distributions: the Fréchet, Weibull or Gumbel distribution.

 

The extreme value distributions are closely related to the generalized Pareto distribution, which describes the limiting distribution of excesses over a high threshold. For a sufficiently high threshold, the distribution function of the excesses may be approximated by the generalized Pareto distribution (GPD) (Balkema and de Haan, 1974; Pickands, 1975), because as the threshold gets large the excess distribution converges to the GPD.

 

 

 

The GPD in general is defined by the distribution function

$$G_{\xi,\mu,\sigma}(x) = \begin{cases} 1 - \left(1 + \dfrac{\xi (x - \mu)}{\sigma}\right)^{-1/\xi}, & \xi \neq 0, \\[1ex] 1 - \exp\!\left(-\dfrac{x - \mu}{\sigma}\right), & \xi = 0, \end{cases}$$

with

$$x \ge \mu \ \text{ when } \xi \ge 0, \qquad \mu \le x \le \mu - \sigma/\xi \ \text{ when } \xi < 0,$$

where $\xi$ is the shape parameter, $\alpha = 1/\xi$ is the tail index, $\sigma > 0$ is the scale parameter, and $\mu$ is the location parameter.
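The sketch below writes out the GPD distribution function and fits a GPD to the excesses over a threshold by maximum likelihood using scipy's genpareto. The simulated claims and their lognormal parameters are placeholders; only the threshold of 15,000,000 follows the empirical analysis later in the paper, which itself used the R package POT rather than this code.

```python
import numpy as np
from scipy.stats import genpareto

def gpd_cdf(x, xi, mu, sigma):
    """G(x) = 1 - (1 + xi*(x - mu)/sigma)**(-1/xi) for xi != 0, else 1 - exp(-(x - mu)/sigma)."""
    z = (x - mu) / sigma
    if xi == 0.0:
        return 1.0 - np.exp(-z)
    return 1.0 - (1.0 + xi * z) ** (-1.0 / xi)

rng = np.random.default_rng(3)
claims = rng.lognormal(mean=15.5, sigma=1.2, size=50_000)      # placeholder claim amounts
u = 15_000_000.0                                               # threshold used later in the paper
excesses = claims[claims > u] - u
# Fit a GPD to the excesses by maximum likelihood, fixing the location at zero.
xi_hat, _, sigma_hat = genpareto.fit(excesses, floc=0.0)
```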

 

Value at Risk Estimation in kernel density estimation

In the nonparametric approach, which uses historical claims (as in historical simulation), value at risk is calculated directly by taking the desired percentile of the distribution of losses. The VaR in this case is estimated by

$$\widehat{\mathrm{VaR}}_{p} = \hat{F}^{-1}(p) = \inf\{x : \hat{F}(x) \ge p\},$$

where $\hat{F}^{-1}(p)$ is the $p$-th quantile of the sample (or kernel-estimated) distribution.
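In code, this nonparametric (historical-simulation) VaR is simply a sample quantile; the short sketch below is illustrative only and mirrors the formula above.

```python
import numpy as np

def var_empirical(losses, p=0.95):
    """Nonparametric VaR_p: the p-th quantile of the observed (or kernel-smoothed) losses."""
    return np.quantile(losses, p)

# e.g. var_empirical(claims, 0.99) gives the 99% value at risk of a claims array
```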

 

Value at Risk Estimation in GPD distribution

The data points in the tail, i.e. the excesses over a high threshold $u$, are described by the conditional excess distribution function

$$F_u(y) = P\,(X - u \le y \mid X > u) = \frac{F(u + y) - F(u)}{1 - F(u)}, \qquad y \ge 0.$$

For large $u$ we can approximate $F_u(y)$ by the GPD $G_{\xi,\sigma}(y)$; it can be shown that the excess distribution is again GPD with the same shape parameter $\xi$ as the tail of $F(x)$. Moreover, $F(u)$ can be estimated from the data by $(n - N_u)/n$, where $n$ is the sample size and $N_u$ is the number of observations exceeding $u$.

Therefore, the tail estimate is

$$\hat{F}(x) = 1 - \frac{N_u}{n}\left(1 + \frac{\hat{\xi}\,(x - u)}{\hat{\sigma}}\right)^{-1/\hat{\xi}}, \qquad x > u,$$

which approximates the distribution function $F(x)$ in the tail.

For a given probability $q > F(u)$, the percentile at the tail (the value at risk) is estimated by inverting the tail estimator:

$$\widehat{\mathrm{VaR}}_{q} = \hat{x}_{q} = u + \frac{\hat{\sigma}}{\hat{\xi}}\left[\left(\frac{n}{N_u}\,(1 - q)\right)^{-\hat{\xi}} - 1\right],$$

where $u$ is the threshold, $\hat{\sigma}$ is the estimated scale parameter, $\hat{\xi}$ is the estimated shape parameter, $n$ is the sample size, $N_u$ is the number of exceedances and $q$ is the chosen probability level.
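The tail quantile above translates directly into code. In the sketch below the shape and scale values are made-up illustrations; the threshold, sample size and number of exceedances mirror the figures reported for the 1389 data later in the paper.

```python
import numpy as np

def var_gpd(q, u, xi, sigma, n, n_u):
    """GPD tail quantile: VaR_q = u + (sigma/xi) * (((n/n_u) * (1 - q))**(-xi) - 1)."""
    return u + (sigma / xi) * ((n / n_u * (1.0 - q)) ** (-xi) - 1.0)

# Threshold 15,000,000 and about 10,000 exceedances out of 109,392 claims (Table 3);
# xi and sigma below are hypothetical values used only to show the calculation.
print(var_gpd(q=0.99, u=15_000_000, xi=0.3, sigma=8_000_000, n=109_392, n_u=10_000))
```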

 

 

 

Empirical work

In this section we present the empirical results of fitting the transformed kernel estimator, the generalized Pareto distribution and the generalized lambda distribution. For the practical analysis we consider the medical claims of the Iran Insurance Company in 1389 and 1390. First, we use a simulation study to investigate the best cdf transformation to use in the transformed kernel density estimation. We then fit the relevant distributions and compare the estimates in order to select the model that provides the best fit to the claims. For the data analysis we use MATLAB (the evim and bounds_matlab packages) and R (the gld, POT and kernlab packages).

 

Simulation study on transformed kernel estimation

In this section we report the results of a simulation study to investigate the relative performance of some selected cdf transformations used in the transformed kernel density estimation.

 

We consider four cdf transformations, namely the lognormal (LN), generalized Pareto, Champernowne, and modified Champernowne. The choice of these transformations is motivated by the fact that the lognormal and generalized Pareto are commonly used to model insurance loss data (Embrechts, Kluppelberg, and Mikosch 1997; Klugman and Rioux 2006), while the Champernowne distribution approaches a form of the Pareto distribution for extreme values (Fisk 1961) and approximates the lognormal distribution for values near zero in some cases (Balasooriya and Low, 2008).

For this simulation study, data are generated from GPD and LN distributions with selected parameter values that represent different shapes of the underlying distributions.

For each generated sample, four transformed kernel densities are obtained using the Champernowne, modified Champernowne, lognormal, and generalized Pareto cdf transformations as outlined above. In assessing the goodness of fit of these estimated densities, we employ a global distance measure criterion, illustrated in the sketch below.
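The sketch below illustrates one possible global distance criterion of this kind: the integrated squared error between an estimated density and the true data-generating density, approximated on a grid. The LN(0, 1) example, the use of scipy's classical kernel estimate and the grid are assumptions of this illustration, not necessarily the exact procedure of the study.

```python
import numpy as np
from scipy.stats import lognorm, gaussian_kde

rng = np.random.default_rng(4)
true_dist = lognorm(s=1.0, scale=1.0)               # LN(0, 1) as the data-generating distribution
sample = true_dist.rvs(size=250, random_state=rng)  # one simulated sample of size n = 250

grid = np.linspace(0.01, true_dist.ppf(0.999), 500)
f_true = true_dist.pdf(grid)
f_est = gaussian_kde(sample)(grid)                  # any density estimate can be scored this way

ise = np.trapz((f_est - f_true) ** 2, grid)         # global (L2) distance between the two curves
```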

 

Among the four cdf transformations, the GPD performs best. For all sample sizes, parameter values and both underlying data-generating distributions, the GPD cdf transformation gives the best results, as shown in Tables 1 and 2. Even when the data-generating distribution is LN, it outperforms the LN cdf transformation, which shows the robustness of the GPD transformation to the data-generating process.

 

 

 

 


 

Table 1: Error rates for transformed and classical kernel density estimation using the global distance criterion

Cdf Transformation      n     (0.2, 1, 0)   (0.2, 1, 0)   (0.2, 1, 0)
Champernowne            100   0.0049        0.0040        0.0031
                              [0.0032]      [0.0516]      [0.0037]
                        250   0.0023        0.0010        0.0059
                              [0.0025]      [0.0120]      [0.0018]
                        500   0.0043        0.0048        0.0036
                              [0.0022]      [0.0092]      [0.0059]
Modified Champernowne   100   0.0016        0.0030        0.0052
                              [0.0031]      [0.0015]      [0.0046]
                        250   0.0011        0.0028        0.0063
                              [0.0025]      [0.0091]      [0.0082]
                        500   0.0011        0.0018        0.0053
                              [0.0022]      [0.0093]      [0.0063]
GPD                     100   0.0040        0.0043        0.0052
                              [0.0121]      [0.0087]      [0.0195]
                        250   0.0015        0.0016        0.0019
                              [0.0046]      [0.0041]      [0.0036]
                        500   0.0071        0.0078        0.0029
                              [0.0082]      [0.0029]      [0.0047]
LN                      100   0.0052        0.0011        0.0031
                              [0.0072]      [0.0038]      [0.0029]
                        250   0.0019        0.0046        0.0036
                              [0.0015]      [0.0051]      [0.0049]
                        500   0.0071        0.0043        0.0028
                              [0.0065]      [0.0041]      [0.0031]

Column headings are the parameter values of the data-generating distribution. Values in square brackets refer to classical kernel density estimation.
Source: author calculation.

 

 

Table 2: Error rates for transformed and classical kernel density estimation using the global distance criterion

Cdf Transformation      n     (0, 0.5)      (0, 1)        (0, 1.25)
Champernowne            100   0.0151        0.0020        0.0062
                              [0.0156]      [0.0030]      [0.0142]
                        250   0.0060        0.0092        0.0069
                              [0.0067]      [0.0121]      [0.0102]
                        500   0.0045        0.0057        0.0021
                              [0.0031]      [0.0180]      [0.0077]
Modified Champernowne   100   0.0069        0.0052        0.0152
                              [0.0201]      [0.0032]      [0.0150]
                        250   0.0026        0.0011        0.0045
                              [0.0036]      [0.0022]      [0.0105]
                        500   0.0013        0.0014        0.0013
                              [0.0038]      [0.0018]      [0.0079]
GPD                     100   0.0012        0.0010        0.0010
                              [0.0032]      [0.0016]      [0.0037]
                        250   0.0045        0.0040        0.0034
                              [0.0095]      [0.0051]      [0.0048]
                        500   0.0020        0.0018        0.0015
                              [0.0022]      [0.0032]      [0.0029]
LN                      100   0.0032        0.0011        0.0011
                              [0.0032]      [0.0016]      [0.0037]
                        250   0.0050        0.0096        0.0020
                              [0.0059]      [0.0120]      [0.0018]
                        500   0.0024        0.0027        0.0048
                              [0.0022]      [0.0032]      [0.0059]

Column headings are the parameter values of the data-generating distribution. Values in square brackets refer to classical kernel density estimation.

 

 

 

 

 

 

 


Analysis of medical claims

In this section we report our attempt to model the medical claims data of the Iran Insurance Company using two parametric distributions, the GLD and the GPD, and the semiparametric transformed kernel density, as outlined above. The data consist of all claim amounts exceeding 250,000 Rials over the period from 1/1 to 6/31 of each of the years 1389 and 1390, and we consider the total claim amount. The 1389 data contain 109,398 observations with a mean of 5,928,700. The bulk of the observations lie below 25,000,000, but there is a significant number of very high claims, the largest being 393,773,548. The data are therefore strongly skewed to the right, with a skewness coefficient of 7.5829.

 

Only the 1389 data are used for estimation; the 1390 data serve as a holdout sample to assess the out-of-sample performance of the estimated models. The sample histograms of the two data sets are presented in Figure 1, which shows that the medical claims are highly skewed.

 

 


 

 

Figure 1: Histogram of the medical claims data of the Iran Insurance Company. (a) The 1389 data. (b) The 1390 data.

 

The descriptive statistics of the medical claims for the 1389 and 1390 data sets are given in Table 3 and Table 4 respectively. Both data sets show significant skewness and kurtosis.

 

Table 3: Descriptive statistics of the medical claims data of the Iran Insurance Company for the year 1389

        N        Minimum   Maximum     Mean          Std. Deviation   Skewness   Kurtosis
Claims  109392   250200    393773548   5.9287e+006   1.2761e+007      7.5829     113.6428

The number of claims, minimum, maximum, mean, standard deviation, skewness and kurtosis of the claims are reported. Source: author calculation.

 

 

Table 4: Descriptive statistics of the medical claims data of the Iran Insurance Company for the year 1390

        N       Minimum   Maximum    Mean          Std. Deviation   Skewness   Kurtosis
Claims  74494   250014    39269850   6.6551e+006   1.5642e+007      8.5590     124.3420

The number of claims, minimum, maximum, mean, standard deviation, skewness and kurtosis of the claims are reported. Source: author calculation.

 

 


 

Figure 2: Mean excess (ME) plot. The horizontal axis shows the thresholds over which the sample means of the excesses are calculated; the vertical axis displays the corresponding mean excesses.

 

 

Figure 3: Variation of the Hill estimate of the shape parameter across the number of upper order statistics. The stable region is indicated in the figure; the number of upper order statistics is restricted to 5,000.


 

 


The mean excess (ME) plot of the medical insurance claims for the 1389 data set is presented in Figure 2. The approximately linear, positive trend in the ME plot indicates that the claims distribution has a heavy tail.

 

One crucial step in using the GPD for large-loss modeling in practice is the choice of the threshold above which the data are assumed to follow a GPD. The choice of threshold is a classical bias-variance trade-off: choosing the threshold too low means that the limiting GPD approximation is not appropriate, whereas choosing it too high leaves too few data points for estimating the GPD parameters. Graphical methods are often used to choose the threshold (Kromann, 2009).

 

The Hill plot of the claims is displayed in Figure 3. The estimate of the shape parameter should be chosen from a region where it is relatively stable, so the number of upper order statistics (thresholds) is restricted in order to examine the stable part of the Hill plot. The stable portion of the figure implies a tail index estimate of about 0.6; in other words, the positive shape parameter indicates fat-tailedness of the claims distribution. We therefore choose 10,000 exceedances, which results in a threshold value of 15,000,000.
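For completeness, the sketch below computes the two diagnostics directly: Hill estimates of the tail index across the number of upper order statistics k, and the empirical mean excess function e(u) = mean(X - u | X > u). The simulated claims are placeholders; the paper produced these plots with its MATLAB/R toolkits rather than with this code.

```python
import numpy as np

def hill_estimates(data, k_max):
    """Hill estimate of the tail index for k = 2..k_max upper order statistics."""
    x = np.sort(np.asarray(data, dtype=float))[::-1]          # descending order statistics
    logs = np.log(x)
    ks = np.arange(2, k_max + 1)
    est = np.array([logs[:k - 1].mean() - logs[k - 1] for k in ks])
    return ks, est

def mean_excess(data, thresholds):
    """Empirical mean excess e(u) = mean(X - u | X > u) for each threshold u."""
    data = np.asarray(data, dtype=float)
    return np.array([(data[data > u] - u).mean() for u in thresholds])

rng = np.random.default_rng(5)
claims = rng.lognormal(mean=15.5, sigma=1.2, size=100_000)    # placeholder for the 1389 claims
ks, hill = hill_estimates(claims, k_max=5_000)                # restrict to 5,000 upper order statistics
thresholds = np.quantile(claims, np.linspace(0.50, 0.99, 50))
me = mean_excess(claims, thresholds)
```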

 

Table 5 compares the deciles of the estimated distributions with the empirical deciles. For the data without a threshold, the empirical deciles are close to the corresponding values of the fitted transformed kernel, but the GLD quantiles do not match the empirical quantiles.

 


 

Table 5: Estimated quantiles (deciles) of the medical claims without a threshold

Quantile   Empirical 89   Empirical 90   Kernel       GLD
10%        350000         356000         2094360      2639400
20%        478030         500000         4871950      2780252
30%        630000         687000         7421920      2923850
40%        930500         1000000        10530920     3070233
50%        1500000        1559100        15677590     3219718
60%        2628896        2656487        26523160     3372888
70%        4896455        5000000        48854540     3530742
80%        8211747        9168069        82741630     3695135
90%        14651508       16562508       146488480    3870363

Source: author calculation.

 


A comparison of the empirical quantiles of the medical claims above the threshold value of 15,000,000 with the transformed kernel density, generalized Pareto and generalized lambda estimates is presented in Table 6. All models fit the empirical data well; in particular, the GLD fitted to the claims above 15,000,000 now matches the empirical quantiles closely.

 


 

Table 6: Estimated quantiles of the medical claims above the 15,000,000 threshold

Quantile   Empirical 89   Empirical 90   Kernel      GLD         GPD
10%        15878453       16388428       15517729    16014676    16262353
20%        17111497       17782963       17371007    17393636    17737072
30%        18928068       19425043       19196365    19125283    19493982
40%        21083760       21500000       21455877    21326304    21640688
50%        24416356       24822709       24598916    24184211    24354098
60%        28700000       29167552       28796238    28036358    27951944
70%        34258564       34998705       34504396    33583272    33083197
80%        44411766       45372252       44479985    42639405    41379261
90%        61908294       63751982       61973335    62449601    59029297

Source: author calculation.

 

 

Table 7: Estimation of the value at risk of the medical claims

Model          Value at Risk
               95%         97.5%       99%         99.5%       99.9%
Empirical 89   23782652    37635070    60839804    79022941    1.35E+08
Empirical 90   27031981    42358593    67095632    90515601    1.98E+08
Kernel         23773979    37691651    60860833    79050828    1.35E+08
GPD            23813377    35938652    58007212    80885396    1.66E+08

Source: author calculation.

 


Empirical comparisons of value at risk estimators

The value-at-risk estimates obtained with the kernel density estimator and the GPD are presented in Table 7. Both models give estimates close to the empirical value at risk for the 1389 and 1390 data sets.

 

Figure 4 compares the value-at-risk estimates of the GPD model and the kernel density estimator with the empirical values at risk for the 1389 and 1390 data sets. The figure demonstrates that both models estimate the empirical value at risk well.

 

 

Figure 4: Comparison of the value-at-risk estimates of the GPD model and the kernel density estimator with the empirical values.

 

 

CONCLUSION:

In Iran, the main public health insurers include the Social Security Organization and the Medical Services Insurance Organization.

 

The range of health services provided to patients has expanded so much that covering them within the framework of basic health insurance is no longer economically viable; in many countries, complementary health insurance has been used to provide these services.

 

Complementary insurance is used to fill the gap between the cost of care and the benefits of basic health insurance. The main advantages of these systems are directly related to the benefits provided in the basic health insurance system.

 

For insurers, a proper assessment of the size of a single claim is of utmost importance. Insurance companies need to investigate claims experience and apply mathematical techniques for many purposes, such as rating, reserving, reinsurance arrangements and solvency. In fact, the total amount of claims in a particular time period is a quantity of fundamental importance to the proper management of an insurance company.

 

In this study we have looked at loss distributions, which provide a mathematical way of modeling individual claims. In fitting distributions to insurance loss data, several families of distributions have been proposed. The common characteristics of these distributions are their skewness to the right and their heavy tails, which capture the occasional large values that are commonly present in insurance loss data. One fundamental question confronting actuaries and other researchers, however, is how to select the best model for a given data set (Balasooriya et al., 2005).

 

In the second part of this paper, parametric and nonparametric specifications for modeling single claims were presented.

 

Three models with characteristics appropriate for modeling financial data have been considered: the transformed kernel density, the generalized Pareto distribution and the generalized lambda distribution.

Our hypothesis is that the transformed kernel density estimator, as a semiparametric approach, performs better than the GLD and GPD models; it is more appropriate for heavy-tailed distributions and yields more accurate estimates.

 

We applied these models to the medical claim amounts exceeding 250,000 Rials of the Iran Insurance Company in the year 1389 and compared the estimation results with the actual claims of the year 1390.

 

We also carried out a simulation study to investigate the best cdf transformation to use in the transformed kernel density estimation. The results of this analysis showed that, in general, the transformed kernel performs better than the classical kernel, and that the GPD is the best transformation to use in the transformed kernel density.

 

In the next step, we applied our models to the medical claims data of the Iran Insurance Company. The preliminary analysis showed that the claims distribution is strongly skewed to the right.

 

When the whole claims data set is considered, the transformed kernel and GPD estimates provide reasonable results; however, the GLD model is not adequate for modeling the higher claims.

 

For claims above the threshold value of 15,000,000, the transformed kernel density, generalized Pareto and generalized lambda all fit the empirical data well.

 

Finally, the comparison of the value-at-risk estimates indicates that the transformed kernel density and the generalized Pareto model estimate the empirical value at risk well. These estimates are also very close to the empirical value at risk of the 1390 claims and can therefore be applied to forecast the claims of the year 1390.

 

REFERENCES:

Balasooriya, U. and Low, C. K. (2008). Modeling insurance claims with extreme observations: transformed kernel density and generalized lambda distribution. North American Actuarial Journal 12: 129-142.

Balasooriya, U., Low, C. K. and Wong, A. (2005). Modeling insurance loss data: the log-EIG distribution. Journal of Actuarial Practice 12: 101-125.

Balkema, A. and de Haan, L. (1974). Residual lifetime at great age. Annals of Probability 2: 792-804.

Chiang Lee, W. (2009). Applying generalized Pareto distribution to the risk management of commerce fire insurance. Department of Banking and Finance, Tamkang University, Taiwan.

Embrechts, P., Kluppelberg, C. and Mikosch, T. (1997). Modelling Extremal Events for Insurance and Finance. Springer, Berlin.

Fisher, R. A. and Tippett, L. H. C. (1928). Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proceedings of the Cambridge Philosophical Society 24: 180-190.

Gustafsson, J., Hagmann, M., Nielsen, J. P. and Scaillet, O. (2007). Local transformation kernel density estimation of loss distributions. Swiss National Science Foundation.

Kromann, T. B. (2009). Comparison of tail performance of the Champernowne transformed kernel density estimator, the generalized Pareto distribution and the g-and-h distribution. Universitetsparken 5.

McNeil, A. J. (1997). Estimating the tails of loss severity distributions using extreme value theory. ASTIN Bulletin 27: 117-137.

Pickands, J. (1975). Statistical inference using extreme order statistics. Annals of Statistics 3: 119-131.

Ramberg, J. S. and Schmeiser, B. W. (1974). An approximate method for generating symmetric random variables. Communications of the ACM 15: 987-990.

Risk factors for individual health care insurance. eHow.com. http://www.ehow.com/about_5175825_risk-individual-health-care-insurance.html#ixzz1qttNOiDl

Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall, London.

Tukey, J. (1960). The practical relationship between the common transformations of percentages of counts and of amounts. Technical Report 36, Statistical Techniques Research Group, Princeton University.

Vafaee Najar, A., Karimi, I. and Seydnowzadi, M. (2007). A comparative study between complementary health assurance structure and content in selected countries; and presenting a paradigm for Iran. Journal of Health Administration 28: 57-72.

Yamada, Y. and Primbs, J. (2002). Value at Risk (VaR) estimation for dynamic hedging. International Journal of Theoretical and Applied Finance 4: 333-354.

 

 

 

 

Received on 09.12.2015    Modified on 24.12.2015
Accepted on 25.01.2016    © A&V Publications. All rights reserved
Asian J. Management; 7(1): Jan.-March, 2016, pp. 36-46
DOI: 10.5958/2321-5763.2016.00006.8