STAT3058/6058 Tutorial 3 Solutions
1. A group one-year term-life insurance portfolio contains policies of two types. On one type of policy, a benefit
   of $5,000 is paid if the policyholder dies within the year of coverage, while on the other a benefit of $10,000
   is paid. Moreover, suppose that the probability of death during the year of coverage among those holding
   the first type of policy is 0.002, while for those holding the second type of policy the probability of death
   is 0.012. Suppose that the portfolio is composed of 500 type-1 and 80 type-2 policies, and that the policies
   are independent with respect to their chance of mortality.
   (a) Using the individual risk model, calculate the mean and variance of the aggregate claim amount, S,
       for this portfolio.
   (b) Using the normal approximation, estimate the probability that the aggregate claim amount over the
       year for this portfolio exceeds $11,000.
    (c) Calculate the probability asked for in part (b) exactly. Do you think the normal approximation is
        reasonable here?
   (d) Approximate S using the collective risk model with an appropriate compound Poisson distributed
       quantity, S̃. Find the mean and variance of S̃. Is the approximation adequate?
    (e) Calculate Skew(S̃) and use the translated Gamma approximation to the distribution of S̃ to estimate
        the probability that the aggregate claim amount exceeds $11,000.
     Solution:
      (a) We have 580 policies, with \(q_i = 0.002\), \(X_i = 5000 = \mu_i\) for \(i = 1, \ldots, 500\) and \(q_i = 0.012\),
          \(X_i = 10000 = \mu_i\) for \(i = 501, \ldots, 580\). Further, we have \(\sigma_i^2 = 0\) for \(i = 1, \ldots, 580\). So,
          \[ E(S) = \sum_{i=1}^{580} q_i \mu_i = 500(0.002)(5000) + 80(0.012)(10000) = 14600 \]
          and
          \[ V(S) = \sum_{i=1}^{580} q_i \{\sigma_i^2 + (1 - q_i)\mu_i^2\} = 500(0.002)(0.998)(5000^2) + 80(0.012)(0.988)(10000^2) = 119798000. \]
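The arithmetic in part (a) can be reproduced with a few vectorised lines. This is a minimal sketch in R; the names n_pol, q and b are illustrative and do not come from the tutorial.

```r
# Individual risk model for Question 1(a): fixed benefits, so sigma_i^2 = 0.
# n_pol, q and b are illustrative names (assumptions, not from the tutorial).
n_pol <- c(500, 80)        # number of policies of each type
q     <- c(0.002, 0.012)   # probability of death for each type
b     <- c(5000, 10000)    # fixed benefit (so mu_i = b) for each type

mean_S <- sum(n_pol * q * b)               # sum of q_i * mu_i
var_S  <- sum(n_pol * q * (1 - q) * b^2)   # sum of q_i * (1 - q_i) * mu_i^2

mean_S
var_S
```

Grouping the policies by type works because every policy of a given type contributes the same term to each sum.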
      (b) From the calculations of part (a), the normal approximation yields:
          \[ P(S > 11000) \approx 1 - \Phi\left(\frac{11000 - 14600}{\sqrt{119798000}}\right) = 1 - \Phi(-0.329) = 0.6288 \]
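As a quick check of the table lookup, the same tail probability can be obtained from pnorm. A minimal sketch in R, with the mean and variance from part (a) typed in directly; the variable names are illustrative.

```r
# Normal approximation for Question 1(b), using the moments from part (a).
mean_S <- 14600
var_S  <- 119798000

# Upper-tail probability P(S > 11000) under N(mean_S, var_S)
p_normal <- pnorm(11000, mean = mean_S, sd = sqrt(var_S), lower.tail = FALSE)
p_normal
```

Because pnorm does not round the z-value to three decimals, the result may differ from the tabulated figure in the last digit.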
      (c) Note that the only way for S to be at most 11000 is if: (1) no claims are made, (2) exactly one
          type-1 claim and no type-2 claims are made, (3) exactly one type-2 claim and no type-1 claims are
          made, or (4) exactly two type-1 claims and no type-2 claims are made. Since the number of type-1
          claims made is binomial with parameters 500 and 0.002, while the number of type-2 claims made is
          binomial with parameters 80 and 0.012, we have:
          \[
          \begin{aligned}
          P(S > 11000) &= 1 - P(S \le 11000) \\
          &= 1 - P(\text{no claims}) - P(\text{1 type-1 claim, no type-2 claims}) \\
          &\quad - P(\text{no type-1 claims, 1 type-2 claim}) - P(\text{2 type-1 claims, no type-2 claims}) \\
          &= 1 - (0.998^{500})(0.988^{80}) - 500(0.002)(0.998^{499})(0.988^{80}) \\
          &\quad - 80(0.012)(0.988^{79})(0.998^{500}) - \frac{500(499)}{2}(0.002^2)(0.998^{498})(0.988^{80}) \\
          &= 0.5139.
          \end{aligned}
          \]
          Note that the normal approximation is rather poor in this case: its estimate of 0.6288 overstates the
          exact probability by more than 0.11.
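The exact calculation can be cross-checked in R with the binomial functions. The sketch below is one way to do it, with illustrative variable names; it also enumerates every combination of claim counts as an independent check of the four-case argument.

```r
# Exact P(S > 11000) for Question 1(c).
# N1 ~ Binomial(500, 0.002) counts type-1 claims; N2 ~ Binomial(80, 0.012) counts type-2 claims.
# S = 5000*N1 + 10000*N2 <= 11000 exactly when (N1, N2) is (0,0), (1,0), (2,0) or (0,1).
p_le <- pbinom(2, 500, 0.002) * dbinom(0, 80, 0.012) +
        dbinom(0, 500, 0.002) * dbinom(1, 80, 0.012)
p_exact <- 1 - p_le

# Independent check: enumerate the full joint distribution of (N1, N2).
n1 <- 0:500
n2 <- 0:80
joint <- outer(dbinom(n1, 500, 0.002), dbinom(n2, 80, 0.012))  # P(N1 = i, N2 = j)
S_val <- outer(5000 * n1, 10000 * n2, "+")                     # value of S for each (i, j)
p_check <- sum(joint[S_val > 11000])

c(p_exact, p_check)
```

The two numbers should agree up to floating-point error, since both evaluate the same probability.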
(d) The appropriate compound Poisson quantity S̃ has rate parameter \(Q = \sum_{i=1}^{580} q_i = 500(0.002) + 80(0.012) = 1.96\) and
    \[
    \begin{aligned}
    F(x) &= (1.96)^{-1} \sum_{i=1}^{580} q_i F_i(x) = (1.96)^{-1}\{500(0.002)F_1(x) + 80(0.012)F_2(x)\} \\
         &= (1.96)^{-1}\{F_1(x) + 0.96F_2(x)\}
    \end{aligned}
    \]
   where F1 (x) is the CDF of a random variable which only takes the value 5000 and F2 (x) is the
   CDF of a random variable which only takes the value 10000 . A little thought shows that F (x) is
   the CDF of a random variable which takes the value 5000 with probability 1/1.96 and takes the
   value 10000 with probability 0.96/1.96. Therefore, µ1 = (1/1.96)5000+(0.96/1.96)10000 = 7448.98
   and
                             E(S̃) = Qµ1 = 1.96(7448.98) = 14600 = E(S).
   Also, we know that
   \[ V(\tilde{S}) = V(S) + \sum_{i=1}^{580} q_i^2 \mu_i^2 = 119798000 + 500(0.002^2)(5000^2) + 80(0.012^2)(10000^2) = 121000000. \]
   Alternatively, we can calculate the second moment of a random variable with CDF F(x) as
   \(\mu_2 = (1/1.96)5000^2 + (0.96/1.96)10000^2 = 61734694\), and thus \(V(\tilde{S}) = Q\mu_2 = 1.96(61734694) = 121000000\).
   Note that \(V(\tilde{S})/V(S) \approx 1.01\), so we might expect that the approximation would be adequate.
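The compound Poisson moments in part (d) can be reproduced numerically. A minimal sketch in R; Q, x, p, m1 and m2 are illustrative names, and the value of V(S) is copied from part (a).

```r
# Compound Poisson approximation for Question 1(d).
Q <- 500 * 0.002 + 80 * 0.012          # Poisson rate, 1.96
x <- c(5000, 10000)                    # possible claim sizes
p <- c(500 * 0.002, 80 * 0.012) / Q    # claim-size probabilities 1/1.96 and 0.96/1.96

m1 <- sum(p * x)      # first raw moment of the claim size
m2 <- sum(p * x^2)    # second raw moment of the claim size

mean_cp <- Q * m1     # E(S-tilde)
var_cp  <- Q * m2     # V(S-tilde)
ratio   <- var_cp / 119798000   # V(S-tilde) / V(S), with V(S) taken from part (a)

c(mean_cp, var_cp, ratio)
```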
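(e) We first note that the third raw moment of a random variable with CDF F(x) is
    \(\mu_3 = (1/1.96)5000^3 + (0.96/1.96)10000^3 = 5.5357 \times 10^{11}\). Thus,
    \[ \rho_{\tilde{S}} = \frac{Q\mu_3}{(Q\mu_2)^{3/2}} = \frac{\mu_3}{\mu_2^{3/2}\sqrt{Q}} = \frac{5.5357 \times 10^{11}}{61734694^{3/2}\sqrt{1.96}} = 0.81517. \]
    So, the appropriate translated Gamma approximation has parameters which solve the equations:
    \[ \frac{2}{\sqrt{\alpha_g}} = 0.81517, \quad \alpha_g\theta_g^2 = 121000000, \quad k + \alpha_g\theta_g = 14600, \]
    which have solutions \(\alpha_g = 6.019\), \(\theta_g = 4483.47\) and \(k = -12388.02\). Therefore,
    \[
    \begin{aligned}
    P(S > 11000) &\approx P(\tilde{S} > 11000) \\
    &\approx P(Y > 11000 + 12388.02) \\
    &= P\left(2(4483.47)^{-1}Y > 2(4483.47)^{-1}(23388.02)\right) \\
    &\approx P(\chi^2_{12} > 10.433) \\
    &= 0.5780
    \end{aligned}
    \]
    where \(Y \sim G(6.019, 4483.47)\). The last approximation uses the fact that \(2Y/\theta_g\) has a chi-squared
    distribution with \(2\alpha_g = 12.04\) degrees of freedom, rounded here to 12.
    [NB: The actual value of P(Y > 11000 + 12388.02) is 0.5813.]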
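The translated Gamma fit can also be evaluated directly with pgamma, which avoids rounding the degrees of freedom to 12. The following is a sketch in R under the same moment-matching equations; all variable names are illustrative.

```r
# Translated Gamma approximation for Question 1(e).
Q <- 1.96
x <- c(5000, 10000)
p <- c(1, 0.96) / Q                    # claim-size probabilities from part (d)

m1 <- sum(p * x)                       # raw moments of the claim size
m2 <- sum(p * x^2)
m3 <- sum(p * x^3)

rho     <- Q * m3 / (Q * m2)^1.5       # skewness coefficient of S-tilde
alpha_g <- (2 / rho)^2                 # solves 2 / sqrt(alpha_g) = rho
theta_g <- sqrt(Q * m2 / alpha_g)      # solves alpha_g * theta_g^2 = V(S-tilde)
k       <- Q * m1 - alpha_g * theta_g  # solves k + alpha_g * theta_g = E(S-tilde)

# P(k + Y > 11000) with Y ~ Gamma(shape = alpha_g, scale = theta_g)
p_tg <- pgamma(11000 - k, shape = alpha_g, scale = theta_g, lower.tail = FALSE)

c(rho, alpha_g, theta_g, k, p_tg)
```

The value of p_tg corresponds to the quantity quoted in the NB above, not to the chi-squared figure of 0.5780.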
2. Mundo Insurance sold 500 fire insurance policies with the following characteristics:
    Policy Type    Number of Policies            Policy Maximum ($) (Mi )        Probability of Claim Per Policy (qi )
         A               200                               400                                   0.03
         B               300                               300                                   0.05
  You are given the following information:
     • Claim amounts for each policy, Xi , are uniformly distributed between 0 and policy maximum Mi .
     • The probability of more than one claim per policy is 0.
     • Claim occurrences are independent.
  Let S be the total claim amount on the entire portfolio of 500 policies.
  The insurer has set aside $10 for each of the 500 policies to pay for future claims. Therefore, the insurer
  has cash reserves with a total amount of $5,000 available to pay claims.
   (a) Calculate E(S) and V (S).
   (b) Using the normal approximation, calculate the probability that S will exceed $5,000.
   (c) Calculate the coefficient of skewness of S (find the exact answer, not an approximation). Given your
       answer for the coefficient of skewness, is the normal approximation a sensible approximation for S for
       this question? Justify your answer.
   (d) Using R, write code to simulate 100,000 values of S (i.e., simulate 100,000 portfolios, where each
       portfolio has 200 policies of type A and 300 policies of type B). Plot out the density of the simulated
       aggregate claims distribution, and use your simulations to estimate the probability that S will exceed
       $5,000.
   (e) The insurer sells an additional number of policies of type B, so that the total number of policies of
       type B is greater than 300. They set aside cash reserves of $15 for each of the additional policies sold.
       Using the normal approximation, calculate how many additional policies of type B would need to be
       sold so that the probability that total claims (for all policies) exceed total cash reserves is less than
       1%.
     Solution:
     (a) Because policy claim amounts are uniformly distributed, the mean claim amount is \(M_i/2\) and the
         variance is \(M_i^2/12\). Therefore, the means for policy types A and B are 200 and 150 respectively,
         and the variances are \(400^2/12\) and \(300^2/12\).
         \[ E(S) = \sum_{i=1}^{500} q_i\mu_i = 200\{0.03(200)\} + 300\{0.05(150)\} = 3,450 \]
         \[
         \begin{aligned}
         V(S) &= \sum_{i=1}^{500} \{q_i\sigma_i^2 + q_i(1 - q_i)\mu_i^2\} \quad \text{since claims on different policies are independent} \\
              &= 200\left\{0.03\,\frac{400^2}{12} + 0.03(1 - 0.03)200^2\right\} + 300\left\{0.05\,\frac{300^2}{12} + 0.05(1 - 0.05)150^2\right\} \\
              &= 312,800 + 433,125 \\
              &= 745,925.
         \end{aligned}
         \]
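These sums can be verified in R by treating each policy type as a block. A minimal sketch; n_pol, M, q, mu and sigma2 are illustrative names.

```r
# Mean and variance of S for Question 2(a).
n_pol <- c(A = 200, B = 300)   # number of policies of each type
M     <- c(400, 300)           # policy maximum
q     <- c(0.03, 0.05)         # probability of a claim per policy

mu     <- M / 2                # mean of Uniform(0, M)
sigma2 <- M^2 / 12             # variance of Uniform(0, M)

mean_S <- sum(n_pol * q * mu)
var_by_type <- n_pol * (q * sigma2 + q * (1 - q) * mu^2)   # contribution of each type
var_S <- sum(var_by_type)

mean_S
var_by_type
var_S
```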
(b) \[ P(S > 5000) \approx P\left(Z > \frac{5000 - 3450}{\sqrt{745925}}\right) = P(Z > 1.7947) = 1 - P(Z \le 1.7947) = 0.0363 \]
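The same tail probability can be read off pnorm instead of a table. A short sketch in R, with the moments from part (a) entered directly; z_b and p_b are illustrative names.

```r
# Normal approximation for Question 2(b).
z_b <- (5000 - 3450) / sqrt(745925)    # standardised reserve level
p_b <- pnorm(z_b, lower.tail = FALSE)  # P(Z > z_b)
c(z_b, p_b)
```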
(c) Find the coefficient of skewness of the distribution of S by using the third central moment formula for S
    when the number of claims is binomially distributed. In other words, find the third central moment
    (written Skew(·) below) for policy types A and B and add them together (because of independence).
      From the lecture slides for the compound binomial: \(\mathrm{Skew}(S_j) = mq\mu_3 - 3mq^2\mu_2\mu_1 + 2mq^3\mu_1^3\), where
      m is the number of policies (or the number of trials) and \(\mu_k\) is the k-th raw moment of the claim
      amount. For the uniform distribution, \(\mu_1 = M_i/2\),
      \(\mu_2 = V(X) + E(X)^2 = M_i^2/12 + M_i^2/4 = M_i^2/3\) and \(\mu_3 = 0.25M_i^3\) (you could find \(\mu_3\) from the
      MGF for the uniform or from an online source).
      For policy type A:
          \[ \mu_1 = 400/2 = 200, \quad \mu_2 = 400^2/3 = 53,333.33, \quad \mu_3 = 0.25(400^3) = 16,000,000. \]
      For policy type B:
          \[ \mu_1 = 300/2 = 150, \quad \mu_2 = 300^2/3 = 30,000, \quad \mu_3 = 0.25(300^3) = 6,750,000. \]
      \[
      \begin{aligned}
      \mathrm{Skew}(S_1) &= mq\mu_3 - 3mq^2\mu_2\mu_1 + 2mq^3\mu_1^3 \\
                         &= 200[0.03\mu_3 - 3(0.03^2)\mu_2\mu_1 + 2(0.03^3)\mu_1^3] \\
                         &= 200[480,000 - 28,800 + 432] \\
                         &= 90,326,400 \\
      \mathrm{Skew}(S_2) &= mq\mu_3 - 3mq^2\mu_2\mu_1 + 2mq^3\mu_1^3 \\
                         &= 300[0.05\mu_3 - 3(0.05^2)\mu_2\mu_1 + 2(0.05^3)\mu_1^3] \\
                         &= 300[337,500 - 33,750 + 843.75] \\
                         &= 91,378,125 \\
      \mathrm{Skew}(S)   &= 90,326,400 + 91,378,125 = 181,704,525
      \end{aligned}
      \]
      The coefficient of skewness is therefore
      \[ \frac{\mathrm{Skew}(S)}{V(S)^{3/2}} = \frac{181704525}{745925^{1.5}} = 0.28. \]
      The normal approximation appears to be sensible because the coefficient of skewness is less than 0.5.
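The third central moments and the resulting coefficient can be checked in R. A minimal sketch using the raw moments of the uniform distribution; the variable names are illustrative.

```r
# Coefficient of skewness of S for Question 2(c).
n_pol <- c(200, 300)     # policies of type A and B
M     <- c(400, 300)     # policy maximum
q     <- c(0.03, 0.05)   # claim probability

m1 <- M / 2              # raw moments of Uniform(0, M)
m2 <- M^2 / 3
m3 <- M^3 / 4

# Third central moment contributed by each policy type
third <- n_pol * (q * m3 - 3 * q^2 * m2 * m1 + 2 * q^3 * m1^3)

# Variance of S, written with raw moments: per policy q*m2 - (q*m1)^2
var_S <- sum(n_pol * (q * m2 - (q * m1)^2))

skew_coef <- sum(third) / var_S^1.5
third
skew_coef
```

Writing the variance as q*m2 - (q*m1)^2 is algebraically the same as q*sigma^2 + q*(1 - q)*mu^2 from part (a).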
(d)
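One possible way to carry out the simulation asked for in part (d) is sketched below. This is an illustration under stated assumptions rather than the tutorial's own code: the seed, the helper function sim_type and the other variable names are all hypothetical choices.

```r
# Simulation sketch for Question 2(d).
set.seed(2024)     # arbitrary seed (an assumption), used only for reproducibility
n_sim <- 100000    # number of simulated portfolios

# sim_type is a hypothetical helper: it simulates the aggregate claim for one policy type.
# The number of claims among m policies is Binomial(m, q); each claim is Uniform(0, M).
sim_type <- function(n_sim, m, q, M) {
  n_claims <- rbinom(n_sim, size = m, prob = q)
  vapply(n_claims, function(k) sum(runif(k, min = 0, max = M)), numeric(1))
}

# Aggregate claims: 200 type-A policies plus 300 type-B policies per portfolio
S <- sim_type(n_sim, 200, 0.03, 400) + sim_type(n_sim, 300, 0.05, 300)

plot(density(S), xlab = "Aggregate claim amount",
     main = "Simulated density of S")

p_hat <- mean(S > 5000)   # estimated P(S > 5000)
p_hat
```

Drawing the claim count per type from a binomial distribution is equivalent to simulating each policy separately, because each policy makes at most one claim and claims are independent. The sample mean and standard deviation of S can be compared with the values from part (a) as a sanity check.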
     (e) Let n be the additional number of policies of type B. We need to find n such that:
         \[ P(S > 5000 + 15n) < 0.01 \]
         First find the updated E(S) and V(S), which include n:
         \[ E(S) = 200\{0.03(200)\} + (300 + n)\{0.05(150)\} = 3,450 + 7.5n \]
         \[
         \begin{aligned}
         V(S) &= 200\left\{0.03\,\frac{400^2}{12} + 0.03(1 - 0.03)200^2\right\} + (300 + n)\left\{0.05\,\frac{300^2}{12} + 0.05(1 - 0.05)150^2\right\} \\
              &= 745,925 + 1443.75n
         \end{aligned}
         \]
         Under the normal approximation,
         \[ P(S > 5000 + 15n) \approx P\left(Z > \frac{5000 + 15n - 3450 - 7.5n}{\sqrt{745925 + 1443.75n}}\right) = P\left(Z > \frac{1550 + 7.5n}{\sqrt{745925 + 1443.75n}}\right) \]
         For the standard normal distribution, we know: \(P(Z \le 2.326) = 0.99\).
         Therefore, solve:
         \[
         \begin{aligned}
         \frac{1550 + 7.5n}{\sqrt{745925 + 1443.75n}} &= 2.326 \\
         1550 + 7.5n &= 2.326\sqrt{745925 + 1443.75n} \\
         (1550 + 7.5n)^2 &= 2.326^2(745925 + 1443.75n) \\
         1550^2 + 23250n + 56.25n^2 &= 4,035,660 + 7811.09n \\
         56.25n^2 + 15438.91n - 1,633,160 &= 0
         \end{aligned}
         \]
         Solve the quadratic and keep the positive root: n = 81.55. Therefore, we obtain n = 82 (we round
         up so that the probability that the reserves cover total claims exceeds 0.99).
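The quadratic can be solved numerically, and the answer confirmed by evaluating the exceedance probability over a range of n. A minimal sketch in R; z, a, b, c0, exceed_prob and n_grid are illustrative names, and z is set to 2.326 to match the value used above.

```r
# Solving for the number of additional type-B policies in Question 2(e).
z <- 2.326   # 99th percentile of the standard normal, as used in the solution

# Coefficients of a*n^2 + b*n + c0 = 0, from (1550 + 7.5n)^2 = z^2 (745925 + 1443.75n)
a  <- 7.5^2
b  <- 2 * 1550 * 7.5 - z^2 * 1443.75
c0 <- 1550^2 - z^2 * 745925

n_root <- (-b + sqrt(b^2 - 4 * a * c0)) / (2 * a)   # positive root
n_min  <- ceiling(n_root)

# Direct check: normal-approximation probability that claims exceed reserves
exceed_prob <- function(n) {
  pnorm((1550 + 7.5 * n) / sqrt(745925 + 1443.75 * n), lower.tail = FALSE)
}
n_grid  <- 0:500
n_check <- min(n_grid[exceed_prob(n_grid) < 0.01])

c(n_root, n_min, n_check)
```

The grid search uses the exact normal percentile implicitly through pnorm, so it also confirms that rounding z to 2.326 does not change the integer answer.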
3. R based question - continued from tutorial 2 question 4.
  The annual aggregate claim amount of an insurer follows a compound Poisson distribution with parameter 5.
  Individual claim amounts follow a Gamma distribution with shape parameter α = 1.5 and scale parameter
  θ = 0.25.
   (a) Find the mean, standard deviation and the standardised coefficient of skewness of the aggregate claims.
   (b) Generate 20,000 simulated aggregate claim values for this insurer, using a random number generator
       seed of 825.
       In your answer script, copy and paste the code used to generate the claim amounts and use the R
       function head() to display the first seven simulated claim values.
   (c) Find the mean, standard deviation and the standardised coefficient of skewness of your generated data.
       Compare this to your answer in part (a) above.
       In your answer script, copy and paste the code used to generate the values as well as the required
       quantities.
   (d) Plot the empirical density function of the simulated aggregate claim values from part (b), setting the
       x-axis range from 0 to 10 and the y-axis range from 0 to 0.45.
       In your answer script, copy and paste the code used to generate the plot as well as the plot.
   (e) Suppose you want to approximate the simulated claims using a normal approximation. State the
       parameters you will use in the approximation.
(f) Generate 20,000 values from the normal distribution of part (e) using a random number generator seed
    of 825.
    In your answer script, copy and paste the code and use the R function head() to display the first
    seven simulated claim values.
(g) Plot the empirical density function of the simulated values in part (f) as a different coloured line in
    the chart that was produced in part (d). Comment on the graph.
    In your answer script, copy and paste the code used to generate the plot as well as the plot.
(h) Now suppose that you want to approximate the simulated claims using a translated gamma
    approximation. State the parameters you will use in the approximation.
(i) Generate 20,000 values from the gamma distribution of part (h) using a random number generator seed
    of 825.
    In your answer script, copy and paste the code and use the R function head() to display the first
    seven simulated claim values.
(j) Plot the empirical density function of the simulated values in part (i) as a different coloured line in the
    chart that was produced in part (d). Comment on the graph.
    In your answer script, copy and paste the code used to generate the plot as well as the plot.
(k) Estimate the 95th , 97.5th and 99.5th percentiles of the simulated claims data, the normal approximation
    and the translated gamma approximation. Comment on the values.
    In your answer script, copy and paste the code used to generate the values as well as the required
    quantities.
(l) Based on your answers from parts (e) to (k), comment on the two forms of approximations.
 Solution:
  (a) We have:
      \[ E(S) = 5 \times 1.5 \times 0.25 = 1.875 \]
      \[ V(S) = 5 \times 1.5 \times 2.5 \times 0.25^2 = 1.1719 \]
      \[ \sigma(S) = \sqrt{V(S)} = 1.0825 \]
      \[ \rho(S) = \frac{3.5}{\sqrt{5 \times 1.5 \times 2.5}} = 0.8083 \]
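The figures in part (a) follow from the raw moments of the Gamma claim size, E(X^k) = α(α+1)...(α+k−1)θ^k, together with the compound Poisson moment formulas. A minimal sketch in R; the names m1, m2, m3, mean_S, var_S, sd_S and rho_S are illustrative.

```r
# Theoretical moments of the compound Poisson aggregate claim in Question 3(a).
lambda <- 5      # Poisson parameter
alpha  <- 1.5    # Gamma shape
theta  <- 0.25   # Gamma scale

# Raw moments of a Gamma(shape = alpha, scale = theta) claim size
m1 <- alpha * theta
m2 <- alpha * (alpha + 1) * theta^2
m3 <- alpha * (alpha + 1) * (alpha + 2) * theta^3

mean_S <- lambda * m1                       # E(S)
var_S  <- lambda * m2                       # V(S)
sd_S   <- sqrt(var_S)                       # standard deviation of S
rho_S  <- lambda * m3 / (lambda * m2)^1.5   # standardised coefficient of skewness

round(c(mean_S, var_S, sd_S, rho_S), 4)
```

The skewness expression simplifies to (α + 2)/sqrt(λα(α + 1)), which is the form 3.5/sqrt(5 × 1.5 × 2.5) used above.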
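  (b) lambda <- 5     # Poisson parameter from the question
      alpha <- 1.5    # Gamma shape parameter from the question
      theta <- 0.25   # Gamma scale parameter from the question
      set.seed(825)
      n <- rpois(20000, lambda)   # number of claims in each simulated year
      s <- numeric(20000)         # aggregate claim amount in each simulated year
      for (i in 1:20000) {
        x <- rgamma(n[i], shape = alpha, scale = theta)   # individual claim amounts
        s[i] <- sum(x)
      }
      round(head(s, 7), 4)
      [1] 3.5761 3.1328 0.7318 2.0129 1.0648 2.7112 0.4009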
  (c) mu2 = mean(s)                              # sample mean
      sigma2 = sd(s)                             # sample standard deviation
      skew2 = sum((s - mean(s))^3) / length(s)   # sample third central moment
      rho2 = skew2 / sigma2^3                    # standardised coefficient of skewness
      mu2
      [1] 1.871006
      sigma2
      [1] 1.077492
      rho2
      [1] 0.7918806
      round(c(mu2, sigma2, rho2), 4)
      [1] 1.8710 1.0775 0.7919
      The mean, standard deviation and skewness of the simulated claims are close to the theoretical
      values from part (a) (1.875, 1.0825 and 0.8083), as expected (20,000 simulated observations is
      typically considered 'enough').
(d) plot(density(s),xlim=c(0,10),ylim=c(0,0.45), xlab = "Simulated Claims",
    main="PDF of Simulated Claims from a Compound Poisson Distribution")
(e) To approximate the simulated claims using a normal approximation, we have N (µ = 1.8710, σ =
    1.0775).
(f) set.seed(825)
    approxdist = rnorm(20000, mean(s), sd(s))
    round(head(approxdist, 7),4)
    [1] 3.9386 1.2521 1.1035 0.9349 2.0339 1.1471 1.8355
(g) plot(density(s),main="Simulated data versus normal approximation",
    xlim=c(0,10),ylim=c(0,0.45),xlab="Aggregate claim size",col="blue")
    lines(density(approxdist), col = "red")
   In the figure above, the blue line is the simulated data and the red line is the normal approximation.
   Several differences can be spotted in this plot:
     • The peaks in their densities are clearly different.
     • The normal distribution generates negative aggregate claim sizes, while the simulated data
       distribution does not have negative aggregate claim sizes.
     • The normal distribution is symmetric while the simulated data distribution is skewed.
     • The normal approximation has a more quickly decaying tail than the simulated data distri-
       bution.
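(h) alpha3 = (2/rho2)^2              # shape: solves 2/sqrt(alpha3) = rho2
    theta3 = sigma2/(sqrt(alpha3))   # scale: solves alpha3*theta3^2 = sigma2^2
    k3 = mu2 - alpha3*theta3         # shift: solves k3 + alpha3*theta3 = mu2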
    alpha3
    [1] 6.378823
    theta3
    [1] 0.4266224
    k3
    [1] -0.8503427
    round(c(alpha3,theta3,k3),4)
    [1] 6.3788 0.4266 -0.8503
(i) set.seed(825)
    Trans.gamma<-rgamma(20000, shape=alpha3, scale=theta3)+k3
    round(head(Trans.gamma, 7),4)
    [1] 4.0353 1.0988 1.8721 1.8291 0.6443 1.6237 2.2132
(j) plot(density(s),main="Simulated data versus translated gamma approximation",
    xlim=c(0,10),ylim=c(0,0.45),xlab="Aggregate claim size",col="blue")
    lines(density(Trans.gamma), col = "red")
   In the figure above, the blue line is the simulated data and the red line is the translated gamma approximation.
   Several findings can be seen in this plot:
      • The translated gamma approximation seems to fit much closer to the simulated data distri-
        bution than the normal approximation.
      • But it still yields negative aggregate claim sizes.
      • There is some deviation around aggregate claim size of 2.
      • The peak is very slightly different.
(k) newmat = cbind(s, approxdist, Trans.gamma)
    probs = c(0.95,0.975, 0.995)
    quant = apply(newmat, 2, quantile, probs = probs)
    quant
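                 Simulation    Normal    Trans Gamma
        95%        3.8546      3.6325      3.8788
        97.5%      4.3368      3.9794      4.3745
        99.5%      5.3783      4.6470      5.4908
   Both approximations are closest to the simulated data at the 95th percentile, and the error increases for
   higher percentiles. The normal approximation understates the simulated percentiles by about 0.22, 0.36
   and 0.73, whereas the translated Gamma approximation is within about 0.11 at all three levels. The
   translated Gamma distribution therefore provides a better approximation than the Normal distribution.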
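The percentiles of the two approximating distributions can also be computed analytically, without simulating from them, which removes the sampling noise from the Normal and Trans Gamma columns. A minimal sketch in R, assuming the rounded parameter values reported in parts (e) and (h); q_norm and q_tg are illustrative names.

```r
# Analytic percentiles of the fitted approximations (rounded parameters from parts (e) and (h)).
probs <- c(0.95, 0.975, 0.995)

mu2    <- 1.8710   # fitted mean
sigma2 <- 1.0775   # fitted standard deviation
alpha3 <- 6.3788   # translated gamma shape
theta3 <- 0.4266   # translated gamma scale
k3     <- -0.8503  # translated gamma shift

q_norm <- qnorm(probs, mean = mu2, sd = sigma2)
q_tg   <- k3 + qgamma(probs, shape = alpha3, scale = theta3)

rbind(Normal = q_norm, TransGamma = q_tg)
```

These values will not match the table exactly, because the table is based on 20,000 random draws from each approximating distribution.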
(l) Overall, as seen from the graphical comparisons as well as those of the percentiles, the translated
    gamma distribution provides a better approximation than the normal distribution.
   This is largely due to the fact that the simulated data has a coefficient of skewness of 0.79. This
   value is >0.5, and we know that the normal approximation works best when the coefficient of
   skewness is <0.5.