Hazard Group Mapping - Robertson Flashcards

1
Q

Why the NCCI moved from 17 limits to 5 for ELFs

A
  1. ELFs at any pair of excess limits are highly correlated across classes.
  2. Limits below $100,000 were heavily represented in the prior 17 limits.
  3. They wanted to cover the range of limits commonly used for retrospective rating.
2
Q

General excess ratio formula

A

R(L) = (expected losses excess of L) / (total expected losses) = 1 − Loss Elimination Ratio(L)
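As a quick numerical illustration (the claim amounts below are made up, not NCCI data), the excess ratio and its complement can be computed empirically:

```python
def excess_ratio(losses, L):
    """R(L): expected losses excess of L divided by total expected losses."""
    total = sum(losses)
    excess = sum(max(x - L, 0) for x in losses)
    return excess / total

def loss_elimination_ratio(losses, L):
    """LER(L) = 1 - R(L): share of losses eliminated by the limit L."""
    return 1 - excess_ratio(losses, L)

# Illustrative claims only
claims = [50_000, 120_000, 300_000]
print(excess_ratio(claims, 100_000))  # (20k + 200k) / 470k ≈ 0.468
```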

3
Q

Normalized excess ratio function for injury type i

A

S_i(r) = E[max(X_i / μ_i − r, 0)] = ∫_r^∞ (t − r) g_i(t) dt
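A minimal sketch of evaluating S_i(r) by numerical integration. The density here is a mean-1 exponential (already normalized, with closed form S(r) = e^(−r)); it is a stand-in for illustration only, not the NCCI's fitted severity distributions:

```python
import math

def S(r, g, upper=50.0, steps=100_000):
    """S(r) = integral from r to upper of (t - r) * g(t) dt, via the trapezoid rule."""
    h = (upper - r) / steps
    total = 0.0
    for k in range(steps + 1):
        t = r + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * (t - r) * g(t)
    return total * h

# Mean-1 exponential density: S(r) = exp(-r) exactly, so we can check the quadrature
g = lambda t: math.exp(-t)
print(S(1.0, g))  # ≈ exp(-1) ≈ 0.3679
```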

4
Q

Injury type weighted excess ratios

A

R_j(L) = Σ_i w_{i,j} · S_i(L / μ_{i,j})

R_c(L) = Σ_i w_{i,c} · S_i(L / μ_{i,c})
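A sketch of the injury-type weighting. The weights, mean severities, and the exponential stand-in for each normalized curve S_i are all invented for illustration, not NCCI values:

```python
import math

def S_exp(r):
    """Stand-in normalized excess ratio: a mean-1 exponential gives S(r) = exp(-r)."""
    return math.exp(-r)

def weighted_excess_ratio(L, weights, means, S=S_exp):
    """R(L) = sum over injury types i of w_i * S_i(L / mu_i)."""
    return sum(w * S(L / mu) for w, mu in zip(weights, means))

# Illustrative: 60% of losses from a low-severity type, 40% from a high-severity type
R = weighted_excess_ratio(100_000, weights=[0.6, 0.4], means=[20_000, 150_000])
print(R)
```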

5
Q

Credibility weighted class excess ratio vectors

A

R_c,final = z · R_c + (1 − z) · R_j

z = min(1.5 · n / (n + k), 1)

n = # of claims in the class

k = average # of claims per class
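A direct transcription of the credibility blend (the excess-ratio vectors below are illustrative):

```python
def credibility_weight(n, k):
    """z = min(1.5 * n / (n + k), 1)."""
    return min(n / (n + k) * 1.5, 1.0)

def blend(Rc, Rj, n, k):
    """R_c,final = z * R_c + (1 - z) * R_j, applied limit by limit."""
    z = credibility_weight(n, k)
    return [z * rc + (1 - z) * rj for rc, rj in zip(Rc, Rj)]

# A class with 100 claims where the average class has 200: z = min(1.5 * 1/3, 1) = 0.5
print(blend([0.30, 0.15, 0.05], [0.25, 0.12, 0.04], n=100, k=200))
```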

6
Q

Other credibility options considered

A
  1. Using the median instead of the average for k.
  2. Excluding Medical Only claims from the analysis.
  3. Including only Serious claims in the analysis.
  4. Requiring a minimum # of claims for classes used in the calculation of k.
  5. Various square root rules, such as z = sqrt(n / 384), which corresponds to a 95% chance of n being within 10% of its expected value.
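The 384 in that square-root rule is itself derivable: under the usual normal approximation, a 95% chance of n falling within 10% of its mean requires about (1.96 / 0.10)² ≈ 384 expected claims. A small check:

```python
import math

def z_sqrt_rule(n, full_credibility=384):
    """Square-root credibility rule: z = sqrt(n / n_full), capped at 1."""
    return min(math.sqrt(n / full_credibility), 1.0)

# Full-credibility standard: (z_{0.975} / tolerance)^2 = (1.96 / 0.10)^2
print(round((1.96 / 0.10) ** 2))  # → 384
print(z_sqrt_rule(96))            # sqrt(96 / 384) = 0.5
```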
7
Q

Measuring distance between vectors

A

The NCCI used the usual L2 (Euclidean) distance, which they would have calculated for every pair of classes c of the R_c,final vectors:

||x − y||_2 = sqrt( Σ_{i=1}^{n} (x_i − y_i)² )

The NCCI also considered using the L1 distance:

||x − y||_1 = Σ_{i=1}^{n} |x_i − y_i|

This would have had the advantage of minimizing the relative error in estimating excess premium, which for class c with limit L is PLR · |R_j(L) − R_c(L)|. Since the analysis was not sensitive to the distance measure, they chose to use the traditional squared distance measure.
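Both distances on a pair of illustrative excess-ratio vectors (the numbers are invented):

```python
import math

def l2_distance(x, y):
    """Euclidean distance: ||x - y||_2 = sqrt(sum (x_i - y_i)^2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def l1_distance(x, y):
    """||x - y||_1 = sum |x_i - y_i|."""
    return sum(abs(a - b) for a, b in zip(x, y))

Rc = [0.30, 0.15, 0.05]  # illustrative class vector
Rj = [0.25, 0.12, 0.04]  # illustrative hazard group vector
print(l2_distance(Rc, Rj))
print(l1_distance(Rc, Rj))
```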

8
Q

Why the NCCI didn’t use standardization

A
  1. The resulting hazard groups with standardization did not differ much from those without it.
  2. Excess ratios at different limits have a similar unit of measure, which is dollars of excess loss per dollar of total loss. Standardization would have eliminated this common denominator.
  3. All excess ratios are between 0 and 1, while standardization could have led to results outside this range.
  4. There is a greater range of excess ratios at lower limits, and this is a good thing since it is based on actual data (compared to excess ratios at higher limits being based more on fitted loss distributions). Standardization would have given this real data less weight.
9
Q

Weighted k-means algorithm

A
  1. Decide on the number of clusters (potential hazard groups) k to target.
  2. Start with an arbitrary initial assignment of classes into k clusters (denoted HG_i below).
  3. Compute the centroid of each cluster i of the k clusters as: R̄_i = ( Σ_{c ∈ HG_i} w_c · R_c ) / ( Σ_{c ∈ HG_i} w_c )
  4. For each class, find the closest centroid using the L2 distance, and assign the class to that cluster. If any classes have been re-assigned in this step, go back to step 3. Continue this process until no classes are re-assigned.
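The four steps above can be sketched in pure Python (variable names and the toy data are my own; this is a minimal sketch, not the NCCI's implementation):

```python
import random

def weighted_kmeans(R, w, k, max_iters=100, seed=0):
    """Weighted k-means on excess-ratio vectors R with class weights w."""
    dim = len(R[0])
    rng = random.Random(seed)
    assign = [rng.randrange(k) for _ in R]  # step 2: arbitrary initial clusters
    for _ in range(max_iters):
        # Step 3: each centroid is the weighted average of its cluster's vectors
        centroids = []
        for i in range(k):
            members = [j for j, a in enumerate(assign) if a == i]
            total_w = sum(w[j] for j in members)
            if total_w == 0:  # empty cluster: park its centroid at the origin
                centroids.append((0.0,) * dim)
                continue
            centroids.append(tuple(
                sum(w[j] * R[j][d] for j in members) / total_w for d in range(dim)))
        # Step 4: reassign each class to the nearest centroid (squared L2 distance)
        new_assign = [
            min(range(k),
                key=lambda i: sum((R[j][d] - centroids[i][d]) ** 2 for d in range(dim)))
            for j in range(len(R))]
        if new_assign == assign:  # no class moved: converged
            break
        assign = new_assign
    return assign, centroids

# Two visibly separated groups of classes
R = [(0.10, 0.05), (0.12, 0.06), (0.50, 0.30), (0.52, 0.31)]
assign, _ = weighted_kmeans(R, w=[1.0, 1.0, 1.0, 1.0], k=2)
```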
10
Q

Desirable optimality properties from k-means

A

K-means maximizes the equivalent of R2 from linear regression, which means maximizing the percentage of total variation explained by the hazard groups. This is equivalent to stating that k-means minimizes the within variance and maximizes the between variance, which means the hazard groups will be homogeneous and well separated.
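The decomposition behind that statement — total sum of squares splits into a within-cluster piece plus a between-cluster piece, so minimizing one maximizes the other — can be checked numerically (one-dimensional and unweighted here for brevity; the data and clustering are invented):

```python
def sums_of_squares(x, assign, k):
    """Return (total, within, between) sums of squares for a clustering of x."""
    n = len(x)
    grand_mean = sum(x) / n
    total = sum((v - grand_mean) ** 2 for v in x)
    within = between = 0.0
    for i in range(k):
        group = [x[j] for j in range(n) if assign[j] == i]
        if not group:
            continue
        gm = sum(group) / len(group)
        within += sum((v - gm) ** 2 for v in group)
        between += len(group) * (gm - grand_mean) ** 2
    return total, within, between

total, within, between = sums_of_squares([1, 2, 9, 10], [0, 0, 1, 1], k=2)
# total == within + between, and the "R^2" analogue is between / total
```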

11
Q

Test statistics used to decide # of hazard groups

A
  1. The Calinski and Harabasz statistic, [Trace(B) / (k − 1)] / [Trace(W) / (n − k)], where n is the # of classes and k is the # of hazard groups. Higher values of this statistic indicate a better # of clusters. The test is also known as the Pseudo-F test since it resembles the F-test of regression analysis.
  2. The Cubic Clustering Criterion (CCC) statistic, which compares the amount of variance explained by a given set of clusters to that expected when clusters are formed at random based on the multi-dimensional uniform distribution. Again, a high value of this statistic indicates better performance.
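A scalar (one-dimensional) version of the Calinski and Harabasz statistic, where Trace(B) and Trace(W) reduce to the between- and within-cluster sums of squares; the data and clustering are invented for illustration:

```python
def calinski_harabasz(x, assign, k):
    """Pseudo-F statistic [Trace(B)/(k-1)] / [Trace(W)/(n-k)], 1-D case."""
    n = len(x)
    grand_mean = sum(x) / n
    W = B = 0.0
    for i in range(k):
        group = [x[j] for j in range(n) if assign[j] == i]
        if not group:
            continue
        gm = sum(group) / len(group)
        W += sum((v - gm) ** 2 for v in group)
        B += len(group) * (gm - grand_mean) ** 2
    return (B / (k - 1)) / (W / (n - k))

# A tight, well-separated two-cluster structure scores very high at k = 2
print(calinski_harabasz([1, 2, 9, 10], [0, 0, 1, 1], k=2))  # → 128.0
```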
12
Q

Why the NCCI didn’t rely on the CCC test statistic showing 9 groups

A
  1. Milligan and Cooper found the Calinski and Harabasz statistic outperformed the CCC statistic.
  2. The CCC statistic deserves less weight when correlation is present, which was the case.
  3. The selection of # of hazard groups ought to be driven by the large classes where most of the experience was concentrated. Using these highly or fully credible classes showed 7 as the optimal number.
  4. There was crossover in the excess ratios between hazard groups when using 9 groups, which is something that isn’t appealing in practice.
13
Q

Underwriter considerations

A
  1. Similarity between class codes that were in different groups.
  2. Degree of exposure to automobile accidents in a given class.
  3. Extent heavy machinery is used in a given class.
14
Q

3 key ideas of hazard group remapping

A
  1. Computing excess ratios by class.
  2. Sorting classes based on excess ratios.
  3. Cluster analysis.