Midterm2-Bokang Li
1.
a.
num=[AGE,VEHESTSPD,AADT,HIGHESTINJ,SEVINDX];
Mean=mean(num)
StandardDeviation=std(num)
Variance=var(num)
Mode=mode(num)
Maximum=max(num)
Minimum=min(num)
Range=range(num)
Q1=prctile(num,25)
Q2=prctile(num,50)
Q3=prctile(num,75)
Statistic            AGE          VEHESTSPD     AADT          HIGHESTINJ    SEVINDX
mean                 44.36        26.6          23052.8       2.12          126.7069379
standard deviation   20.750816    15.72940372   17983.81079   1.037479463   115.7237954
variance             430.596364   247.4141414   323417450.7   1.076363636   13391.99681
mode                 28           15            27500         1             11.56522326
maximum              85           75            94500         4             784.5617203
minimum              16           5             700           1             11.56522326
range                69           70            93800         3             772.996497
1st quartile (Q1)    25           15            8500          1             52.63604541
2nd quartile (Q2)    42           25            20750         2             90.85236987
3rd quartile (Q3)    61.5         35            31000         3             158.383865
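The SEVINDX mode in the table equals its minimum (11.56522326). This is the expected
behavior of mode for data in which no value repeats: every value occurs once, and MATLAB
returns the smallest of the tied values. A minimal sketch with made-up numbers (the vector
x is hypothetical and not taken from the data set):

```matlab
% Hypothetical continuous measurements in which every value is unique.
x = [52.6 11.6 784.6 90.9 158.4];

% All values occur exactly once, so mode returns the smallest tied value.
m = mode(x);

% For such data the mode carries no information beyond the minimum.
isSameAsMin = (m == min(x));
```

For a continuous variable like SEVINDX, the median or a histogram is usually a more
informative summary than the mode.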
b.
>> newsheet17=sortrows(sheet17,3)
>> younger=newsheet17(1:26,:)
>> midage=newsheet17(27:78,:)
>> older=newsheet17(79:100,:)
>> sevindxyounger=younger(:,14)
>> sevindxmidage=midage(:,14)
>> sevindxolder=older(:,14)
>> SEVINDXYOUNGER=sevindxyounger{:,1};
>> SEVINDXMIDAGE=sevindxmidage{:,1};
>> SEVINDXOLDER=sevindxolder{:,1};
>> subplot(1,3,1);
boxplot(SEVINDXYOUNGER);
title('SEVINDX25andyounger');
xlabel('25 and younger');
subplot(1,3,2);
boxplot(SEVINDXMIDAGE);
title('SEVINDX25to65');
xlabel('25 to 65');
subplot(1,3,3);
boxplot(SEVINDXOLDER);
title('SEVINDX65andolder');
xlabel('65 and older')
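The row ranges 1:26, 27:78 and 79:100 above depend on the table being sorted by age first.
An alternative sketch selects each age group with logical indexing, so no sorting or
hard-coded row numbers are needed. The small table T and the exact group boundaries
(25 or younger, above 25 up to 65, above 65) are assumptions for illustration:

```matlab
% Hypothetical miniature table; the real sheet17 has more rows and columns.
AGE     = [19; 25; 26; 40; 65; 66; 80];
SEVINDX = [50; 60; 90; 120; 100; 200; 300];
T = table(AGE, SEVINDX);

% Assumed group boundaries: <=25, above 25 up to 65, and >65.
isYounger = T.AGE <= 25;
isOlder   = T.AGE > 65;
isMidage  = ~isYounger & ~isOlder;

% Logical indexing selects the rows of each group without sorting.
sevYounger = T.SEVINDX(isYounger);
sevMidage  = T.SEVINDX(isMidage);
sevOlder   = T.SEVINDX(isOlder);
```

Each of the three vectors can then be passed to boxplot in the same way as
SEVINDXYOUNGER, SEVINDXMIDAGE and SEVINDXOLDER.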
c.
>> youngerdrvaction=younger.DRVACTION;
>> midagedrvaction=midage.DRVACTION;
>> olderdrvaction=older.DRVACTION;
>> valueyounger=tabulate(youngerdrvaction)
valueyounger =
  5×3 cell array
    {'Careless'}    {[13]}    {[     50]}
    {'Close'   }    {[ 1]}    {[ 3.8462]}
    {'OTHERS'  }    {[ 5]}    {[19.2308]}
    {'RUN'     }    {[ 4]}    {[15.3846]}
    {'RoW'     }    {[ 3]}    {[11.5385]}
>> valuemidage=tabulate(midagedrvaction)
valuemidage =
  5×3 cell array
    {'Careless'}    {[24]}    {[46.1538]}
    {'Close'   }    {[ 3]}    {[ 5.7692]}
    {'OTHERS'  }    {[ 8]}    {[15.3846]}
    {'RUN'     }    {[ 6]}    {[11.5385]}
    {'RoW'     }    {[11]}    {[21.1538]}
>> valueolder=tabulate(olderdrvaction)
valueolder =
  5×3 cell array
    {'Careless'}    {[5]}    {[22.7273]}
    {'Close'   }    {[1]}    {[ 4.5455]}
    {'OTHERS'  }    {[5]}    {[22.7273]}
    {'RUN'     }    {[7]}    {[31.8182]}
    {'RoW'     }    {[4]}    {[18.1818]}
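>> X = categorical({'Careless','Close','OTHERS','RUN','RoW'});
>> nCareless=[13 24 5];   % counts for [younger, midage, older]
>> nClose=[1 3 1];        % renamed: close and run are built-in MATLAB functions
>> nOthers=[5 8 5];
>> nRun=[4 6 7];
>> nRoW=[3 11 4];
>> Y=[nCareless;nClose;nOthers;nRun;nRoW];   % 5x3: one row per action, one column per age group
>> bar(X,Y)
legend('25 and younger','between 25 and 65','65 and older')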
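The three age groups have different sizes (26, 52 and 22 drivers), so raw counts are not
directly comparable between groups. A minimal sketch that converts the counts reported by
tabulate into within-group percentages (the variable names counts, groupSize and pct are
introduced here for illustration):

```matlab
% Counts copied from the tabulate output above.
% Rows: Careless, Close, OTHERS, RUN, RoW. Columns: younger, midage, older.
counts = [13 24 5;
           1  3 1;
           5  8 5;
           4  6 7;
           3 11 4];

% Group sizes are the column sums.
groupSize = sum(counts, 1);

% Convert each column to percentages of its own group
% (implicit expansion, needs R2016b or later).
pct = 100 * counts ./ groupSize;
```

Plotting pct instead of the raw counts, for example with bar(X,pct), would put the three
groups on a common scale.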
d.
Rank (by SEVINDX)   Crash ID   SEVINDX       Driver age   Vehicle speed   Driver action
1st highest         2          784.5617203   34           55              'OTHERS'
2nd highest         84         510.4166109   69           55              'RUN'
3rd highest         86         449.6003925   67           25              'RUN'
4th highest         67         384.8226219   61           35              'OTHERS'
5th highest         92         373.5032851   50           35              'Careless'
>> newsheet17d=sortrows(sheet17,14,'descend')
>> highestsevindx=newsheet17d(1:5,[1 14 3 5 2])
e.
>> correlation1 =corrcoef(AGE, SEVINDX);
correlation2= corrcoef(VEHESTSPD, SEVINDX);
correlation3= corrcoef(AADT,SEVINDX);
correlation4= corrcoef(HIGHESTINJ,SEVINDX);
Correlation_Coefficients = {'AGE'; 'VEHESTSPD'; 'AADT'; 'HIGHESTINJ'};
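% Store the coefficients under a new name so the SEVINDX data is not overwritten (it is used again in f).
rSEVINDX=[correlation1(2,1);correlation2(2,1);correlation3(2,1);correlation4(2,1)];
>> A=table(Correlation_Coefficients, rSEVINDX, 'VariableNames', {'Correlation_Coefficients','SEVINDX'})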
A=
     4×2 table
       Correlation_Coefficients    SEVINDX
       ________________________    ________
            {'AGE'       }           0.3591
            {'VEHESTSPD' }          0.48412
            {'AADT'      }         -0.40378
            {'HIGHESTINJ'}           0.1499
>> Finaltable=sortrows(A,2,'descend')
Finaltable =
     4×2 table
       Correlation_Coefficients    SEVINDX
       ________________________    ________
            {'VEHESTSPD' }          0.48412
            {'AGE'       }           0.3591
            {'HIGHESTINJ'}           0.1499
            {'AADT'      }         -0.40378
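Sorting the signed coefficients in descending order places AADT last, although its
magnitude (0.40378) is the second largest. If the goal is to rank the variables by
strength of correlation, one option is to sort by absolute value. A minimal sketch using
the coefficients reported above (the names r, order and ranked are introduced here):

```matlab
% Coefficients copied from table A above.
names = {'AGE'; 'VEHESTSPD'; 'AADT'; 'HIGHESTINJ'};
r     = [0.3591; 0.48412; -0.40378; 0.1499];

% Sort by magnitude so that strong negative correlations are not ranked last.
[~, order] = sort(abs(r), 'descend');
ranked = table(names(order), r(order), 'VariableNames', {'Variable', 'r'});
```

The sign is kept in the r column, so the direction of each relationship is still visible
after ranking.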
f.
>> subplot(2,2,1);
scatter(AGE,SEVINDX,'filled','b');
title('Correlation Value:0.3591');
xlabel('Driver Age');
ylabel('Severity Index');
>> grid on;
>> subplot(2,2,2);
scatter(VEHESTSPD,SEVINDX,'filled','m');
title('Correlation Value:0.48412');
xlabel('Vehicle Speed');
ylabel('Severity Index');
>> grid on;
>> subplot(2,2,3);
scatter(AADT,SEVINDX,'filled','k');
title('Correlation Value:-0.40378');
xlabel('Average Annual Daily Traffic');
ylabel('Severity Index');
>> grid on;
>> subplot(2,2,4);
scatter(HIGHESTINJ,SEVINDX,'filled','y');
title('Correlation Value:0.1499');
xlabel('Highest Level of Injury');
ylabel('Severity Index');
>> grid on;
g.
[Figures: polynomial fits of degree 1, 2 and 3 for SEVINDX vs AGE, SEVINDX vs VEHESTSPD,
SEVINDX vs AADT and SEVINDX vs HIGHESTINJ (12 plots)]
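Findings:
The degree-3 polynomials generally fitted the data better than the degree-1 and degree-2
polynomials, because a higher-degree curve is more flexible and a straight line cannot
follow complex, non-linear data. Note that the fitting error never increases when the
degree is raised, so a closer fit alone does not show that the cubic model describes the
underlying relationship better.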
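The comparison between polynomial degrees can be made numerically as well as visually.
The sketch below fits degrees 1 to 3 with polyfit and computes R-square and adjusted
R-square for each. The data are synthetic (a noisy straight line generated here), not the
crash data, and the adjusted R-square formula is the standard one that penalizes extra
coefficients:

```matlab
% Synthetic data: a straight line plus noise (not the crash data).
rng(0);                                  % fixed seed for repeatability
x = linspace(0, 10, 40)';
y = 2 + 0.5*x + randn(size(x));

n   = numel(x);
SST = sum((y - mean(y)).^2);             % total sum of squares
R2    = zeros(3, 1);
adjR2 = zeros(3, 1);

for d = 1:3
    p   = polyfit(x, y, d);              % least-squares polynomial of degree d
    SSE = sum((y - polyval(p, x)).^2);   % residual sum of squares
    R2(d)    = 1 - SSE/SST;
    adjR2(d) = 1 - (SSE/(n - d - 1)) / (SST/(n - 1));
end
```

R2 cannot decrease as the degree grows because each model contains the previous one, so
adjusted R-square is the more useful number when comparing degrees.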
According to the boxplots obtained in b, the severity index tends to increase as the age
of the driver increases.
According to the results obtained in c, for drivers aged 25 or younger, careless driving
is the most common driver action in crashes (50.0%) and following too closely is the least
common (3.8%). For drivers between 25 and 65, careless driving is again the most common
(46.2%) and following too closely the least common (5.8%), while failing to yield the
right-of-way is also notable (21.2%). For drivers older than 65, the share of crashes due
to careless driving is clearly lower (22.7%) and following too closely remains the least
common (4.5%), while the shares due to running a red light or stop sign (31.8%) and other
actions (22.7%) are higher than in the two younger groups.
According to the results obtained in d, two of the five most severe crashes involved
drivers older than 65, and four of the five involved drivers aged 50 or older. Running a
red light or stop sign accounts for two of the five, so it is an important cause of
crashes with a high severity index. The two most severe crashes both occurred at 55 mph,
which suggests that high-speed driving can also lead to crashes with a high severity index.
According to the results obtained in e, the severity index is moderately correlated with
the vehicle’s speed at the time of the crash (r = 0.48), the average annual daily traffic
(r = -0.40, a negative relationship) and the at-fault driver’s age (r = 0.36), and only
weakly correlated with the highest injury severity resulting from the crash (r = 0.15).
I think that speed limits should be lowered on days with strong daylight to prevent more
severe crashes. Drivers between 25 and 65 should be encouraged to yield the right-of-way,
younger drivers should be encouraged to drive more carefully, and elderly drivers should
be encouraged to avoid driving in strong sunlight and to drive under better light
conditions.
AADT:
Traffic volume on days with strong daylight should be regulated: more time should be
added to the red lights and the time for green lights should be reduced.
DRVACTION:
Drivers should be given more training to improve their driving safety awareness. Unsafe
behaviors should be recorded and fined to discourage irregular, unsafe driving habits.
AGE:
Elderly drivers should be encouraged to drive less under bad light conditions and to keep
to lower speeds while driving.
VEHESTSPD:
The speed of vehicles on the roads should be limited to different values under different
weather conditions.
2.
a.
Comments:
In all six hypothesis tests h = 0, so the null hypothesis is not rejected at the 5%
significance level. Every p-value is approximately 1 (not 0) and every z statistic is
approximately 0, because each hypothesized mean was set equal to the sample mean of the
same data. There is therefore no evidence that the means differ from H0, although failing
to reject H0 does not prove that H0 is true.
>> load carbig
Horsepower=Horsepower(~isnan(Horsepower));
MPG=MPG(~isnan(MPG));
meanacceleration=15.5197;
meancylinders=5.4754;
meandisplacement=194.7796;
meanhorsepower=105.0825;
meanweight=2979.4138;
meanmpg=23.5146;
stdacceleration=2.8034;
stdcylinders=1.7122;
stddisplacement=104.9225;
stdhorsepower=38.7688;
stdweight=847.0043;
stdmpg=7.816;
>> [h1,p1,ci1,zval1] = ztest(Acceleration,meanacceleration,stdacceleration,0.05)
h1 =
       0
p1 =
    1.0000
ci1 =
    15.2470
    15.7924
zval1 =
    3.1866e-05
>> [h2,p2,ci2,zval2] = ztest(Cylinders,meancylinders,stdcylinders,0.05)
h2 =
       0
p2 =
      0.9997
ci2 =
      5.3088
      5.6419
zval2 =
   -3.5942e-04
>> [h3,p3,ci3,zval3] = ztest(Displacement,meandisplacement,stddisplacement,0.05)
h3 =
       0
p3 =
      1.0000
ci3 =
   184.5736
   204.9855
zval3 =
   -8.3249e-06
>> [h4,p4,ci4,zval4] = ztest(Horsepower,meanhorsepower,stdhorsepower,0.05)
h4 =
       0
p4 =
       1
ci4 =
   101.2832
   108.8818
zval4 =
       0
>> [h5,p5,ci5,zval5] = ztest(Weight,meanweight,stdweight,0.05)
h5 =
       0
p5 =
      1.0000
ci5 =
    1.0e+03 *
      2.8970
      3.0618
zval5 =
   -1.6406e-07
>> [h6,p6,ci6,zval6] = ztest(MPG,meanmpg,stdmpg,0.05)
h6 =
       0
p6 =
      0.9999
ci6 =
    22.7467
    24.2824
zval6 =
 -6.9262e-05
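The z statistics above are all close to 0 because each hypothesized mean equals the sample
mean of the data being tested. The sketch below reproduces the z-test arithmetic by hand
to show this, and contrasts it with a hypothesized mean chosen independently of the sample.
The vector x and the value 18 are hypothetical. The two-sided p-value uses the identity
p = erfc(|z|/sqrt(2)) for the standard normal distribution:

```matlab
% Hypothetical sample; any data set behaves the same way.
x = [14.2 15.1 16.8 13.5 15.9 17.0 14.4];
n = numel(x);
sigma = std(x);            % treated as the known standard deviation
se = sigma / sqrt(n);      % standard error of the mean

% Case 1: H0 mean equals the sample mean, as in the tests above.
m0 = mean(x);
z0 = (mean(x) - m0) / se;
p0 = erfc(abs(z0) / sqrt(2));   % two-sided p-value of a z-test

% Case 2: H0 mean chosen independently of the sample (hypothetical value).
m1 = 18;
z1 = (mean(x) - m1) / se;
p1 = erfc(abs(z1) / sqrt(2));
```

In case 1 the test can never reject H0, whatever the data look like. Only a hypothesized
mean that comes from outside the sample, as in case 2, makes the test informative.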
b.
Normal fit is not applicable.
1.1.acceleration linear
SSE=2618.3
MSE= 6.6286
RMSE=2.5746
R2=0.1328
Adjusted R2=0.1284
The linear curve did not fit the dataset well, as the R2 and adjusted R2 were too small,
only about 0.13.
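The goodness-of-fit numbers listed in this part are related to each other. Under the usual
definitions, MSE is SSE divided by the residual degrees of freedom and RMSE is the square
root of MSE. A small sketch using the values reported for the acceleration linear fit (the
name dfe for the residual degrees of freedom is introduced here):

```matlab
% Values reported for the acceleration linear fit.
SSE  = 2618.3;
RMSE = 2.5746;

% MSE is the square of RMSE.
MSE = RMSE^2;          % about 6.6286

% MSE is also SSE divided by the residual degrees of freedom,
% so the degrees of freedom can be recovered from the two.
dfe = SSE / MSE;       % about 395
```

The same check can be applied to any of the other fits to catch copying errors in the
reported MSE values.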
1.2.acceleration polynomials
SSE=2485.8
MSE= 6.27753025
RMSE=2.5055
R2=0.1766
Adjusted R2=0.17456
The polynomial fitting with a degree of 1 did not fit well for the dataset, as the R2, and
adjusted R2 were too small, only about 0.17 or 0.18.
SSE=2414.5
MSE= 6.1128
RMSE=2.4724
R2=0.2003
Adjusted R2=0.1962
The polynomial fitting with a degree of 2 did not fit well for the dataset, as the R2, and
adjusted R2 were too small, only about 0.20 or 0.19.
SSE=2315.6
MSE= 5.8772
RMSE=2.4243
R2=0.2330
Adjusted R2=0.2272
The polynomial fitting with a degree of 3 did not fit well for the dataset, as the R2, and
adjusted R2 were too small, only about 0.23 or 0.22.
1.3acceleration exponential
SSE=2501.9
MSE= 6.3182
RMSE=2.5136
R2=0.1713
Adjusted R2=0.1692
The exponential fitting did not fit well for the dataset, as the R2, and adjusted R2 were
too small, only about 0.17.
2.1cylinders linear
SSE=671.61
MSE= 1.700
RMSE=1.3039
R2=0.4153
Adjusted R2=0.4124
The linear fitting did not fit well for the dataset, as the R2, and adjusted R2 were too
small, only about 0.41.
2.2cylinders polynomials
SSE=458.05
MSE= 1.1567
RMSE=1.0755
R2=0.6012
Adjusted R2=0.6002
The polynomial fitting with a degree of 1 did not fit well for the dataset, as the R2, and
adjusted R2 were too small, only about 0.6.
SSE=299.01
MSE= 0.7570
RMSE=0.8701
R2=0.7397
Adjusted R2=0.7384
The polynomial fitting with a degree of 2 fitted better than the former two fittings, but
it still did not fit the dataset very well, as the R2 and adjusted R2 were only about
0.74.
SSE=298.49
MSE= 0.7576
RMSE=0.8704
R2=0.7402
Adjusted R2=0.7382
The polynomial fitting with a degree of 3 was practically identical to the degree-2
fitting (R2 of 0.7402 against 0.7397, and a slightly lower adjusted R2 of 0.7382 against
0.7384), so it still did not fit the dataset very well.
2.3cylinders exponential
SSE=372.8
MSE= 0.9415
RMSE=0.9703
R2=0.6755
Adjusted R2=0.6746
The exponential fitting did not fit well for the dataset, as the R2, and adjusted R2 were
too small, only about 0.68.
3.1displacement linear
SSE= 2.347e+06
MSE= 5942.5597
RMSE=77.088
R2=0.4562
Adjusted R2=0.45342
The linear fitting did not fit well for the dataset, as the R2, and adjusted R2 were too
small, only about 0.45.
3.2displacement polynomials
SSE= 1.525e+06
MSE= 3850.3266
RMSE=62.051
R2=0.6467
Adjusted R2=0.6459
The polynomial fitting with a degree of 1 did not fit well for the dataset, as the R2, and
adjusted R2 were too small, only about 0.65.
SSE: 9.609e+05
R-square: 0.7774
Adjusted R-square: 0.7762
RMSE: 49.32
MSE: 2432.4624
The polynomial fitting with a degree of 2 fits better than the former ones, but still not
good enough, as the R2, and adjusted R2 were about 0.77.
SSE: 9.521e+05
R-square: 0.7794
Adjusted R-square: 0.7777
RMSE: 49.16
MSE: 2416.7056
The polynomial fitting with a degree of 3 fits slightly better than the former ones and is
fairly good, as the R2 and adjusted R2 were about 0.78.
3.3displacement exponential
Goodness of fit:
   SSE: 1.043e+06
   R-square: 0.7584
   Adjusted R-square: 0.7578
   RMSE: 51.32
   MSE: 2633.7424
The exponential fitting fits better than some of the other fits, but still not well
enough, as the R2 and adjusted R2 were about 0.76.
4.1horsepower linear
Goodness of fit:
   SSE: 3.295e+05
   R-square: 0.4312
   Adjusted R-square: 0.4283
   RMSE: 29.1
   MSE: 846.81
The linear fitting did not fit well for the dataset, as the R2, and adjusted R2 were about
0.43.
4.2horsepower polynomials
Goodness of fit:
   SSE: 2.283e+05
   R-square: 0.6059
   Adjusted R-square: 0.6049
   RMSE: 24.19
   MSE: 585.1561
The polynomial fitting with a degree of 1 did not fit well for the dataset, as the R-square
was only 0.6059 and Adjusted R-square was only 0.6049.
Goodness of fit:
  SSE: 1.634e+05
  R-square: 0.718
  Adjusted R-square: 0.7165
  RMSE: 20.49
  MSE: 419.8401
The polynomial fitting with a degree of 2 did not fit well for the dataset, as the R-square
was only 0.718 and Adjusted R-square was only 0.7165.
Goodness of fit:
  SSE: 1.505e+05
  R-square: 0.7403
  Adjusted R-square: 0.7382
  RMSE: 19.69
  MSE: 387.6961
The polynomial fitting with a degree of 3 did not fit well for the dataset, as the R-square
was only 0.7403 and Adjusted R-square was only 0.7382.
4.3horsepower exponential
Goodness of fit:
   SSE: 1.818e+05
   R-square: 0.6861
   Adjusted R-square: 0.6853
   RMSE: 21.59
   MSE: 466.1281
The exponential fitting did not fit well for the dataset, as the R-square was only 0.6861
and Adjusted R-square was only 0.6853.
5.1weight linear
Goodness of fit:
   SSE: 1.414e+08
   R-square: 0.5032
   Adjusted R-square: 0.5007
   RMSE: 598.4
   MSE: 358082.56
The linear fitting did not fit well for the dataset, as the R-square was only 0.5032 and
Adjusted R-square was only 0.5007.
5.2weight polynomials
Goodness of fit:
   SSE: 8.775e+07
   R-square: 0.6918
   Adjusted R-square: 0.691
   RMSE: 470.7
   MSE: 221,558.49
The polynomial fitting with a degree of 1 did not fit well for the dataset, as the R-square
was only 0.6918 and Adjusted R-square was only 0.691.
Goodness of fit:
  SSE: 5.754e+07
  R-square: 0.7979
  Adjusted R-square: 0.7969
  RMSE: 381.7
  MSE: 145694.89
The polynomial fitting with a degree of 2 fitted the dataset fairly well, as the R-square
was 0.7979 and the Adjusted R-square was 0.7969.
Goodness of fit:
  SSE: 5.703e+07
  R-square: 0.7997
  Adjusted R-square: 0.7982
  RMSE: 380.5
  MSE: 144780.25
The polynomial fitting with a degree of 3 fitted the dataset fairly well, as the R-square
was 0.7997 and the Adjusted R-square was 0.7982.
5.3weight exponential
Goodness of fit:
   SSE: 6.917e+07
   R-square: 0.7571
   Adjusted R-square: 0.7564
   RMSE: 417.9
   MSE: 174640.41
The exponential fitting did not fit the dataset well enough, as the R-square was 0.7571
and the Adjusted R-square was 0.7564.
c.
>> load carbig   % reload: Horsepower and MPG were shortened in a. and no longer match Model_Year
>> subplot(3,2,1);
plot(Model_Year,Acceleration,':.r');
title('Acceleration-Time')
xlabel('Time(year)')
ylabel('Acceleration')
subplot(3,2,2);
plot(Model_Year,Cylinders,':.r');
title('Cylinders-Time');
xlabel('Time(year)');
ylabel('Cylinders');
subplot(3,2,3);
plot(Model_Year,Displacement,':.r');
title('Displacement-Time');
xlabel('Time(year)');
ylabel('Displacement');
subplot(3,2,4);
plot(Model_Year,Horsepower,':.r');
title('Horsepower-Time');
xlabel('Time(year)');
ylabel('Horsepower');
subplot(3,2,5);
plot(Model_Year,Weight,':.r');
title('Weight-Time');
xlabel('Time(year)');
ylabel('Weight');
subplot(3,2,6);
plot(Model_Year,MPG,':.r');
title('MPG-Time');
xlabel('Time(year)');
>> ylabel('MPG');
On average, acceleration increased slightly over time, and the number of cylinders
changed slightly with no clear pattern. Displacement, horsepower and weight all decreased
between model years 70 and 82, while MPG improved over the same period.
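Trends like these are easier to confirm from yearly averages than from the raw points. A
minimal sketch that computes the mean of a variable per model year while skipping NaN
entries. The vectors year and mpg are short hypothetical stand-ins for Model_Year and MPG:

```matlab
% Hypothetical stand-ins for Model_Year and MPG (NaN marks a missing value).
year = [70; 70; 71; 71; 72; 72];
mpg  = [18; NaN; 20; 22; 25; 27];

% Drop missing values from both vectors together so they stay aligned.
ok = ~isnan(mpg);

% Map each remaining observation to the index of its year.
[yrs, ~, idx] = unique(year(ok));

% Average the values that share the same year index.
meanMpg = accumarray(idx, mpg(ok), [], @mean);
```

Plotting meanMpg against yrs gives one point per year, which makes an upward or downward
trend easier to read than a scatter of all cars.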
d.
(a).
Comments:
In all six hypothesis tests h = 0, so the null hypothesis is not rejected at the 5%
significance level. Every p-value is approximately 1 (not 0) and every z statistic is
approximately 0, because each hypothesized mean was set equal to the sample mean of the
same data. There is therefore no evidence that the means differ from H0, although failing
to reject H0 does not prove that H0 is true.
>> load carsmall
>> Horsepower=Horsepower(~isnan(Horsepower));
MPG=MPG(~isnan(MPG));
>> meanacceleration=15.028;
>> meancylinders=5.62;
>> meandisplacement=207.6;
>> meanhorsepower=112.0404;
>> meanweight=3011.83;
>> meanmpg=23.7181;
>> stdacceleration=3.3485;
>> stdcylinders=1.791;
>> stddisplacement=111.7745;
>> stdhorsepower=45.5268;
>> stdweight=806.9624;
>> stdmpg=8.0357;
>> [h1,p1,ci1,zval1] = ztest(Acceleration,meanacceleration,stdacceleration,0.05)
h1 =
       0
p1 =
      1.0000
ci1 =
    14.3717
    15.6843
zval1 =
   -1.0610e-14
>> [h2,p2,ci2,zval2] = ztest(Cylinders,meancylinders,stdcylinders,0.05)
h2 =
       0
p2 =
       1
ci2 =
      5.2690
      5.9710
zval2 =
       0
>> [h3,p3,ci3,zval3] = ztest(Displacement,meandisplacement,stddisplacement,0.05)
h3 =
       0
p3 =
       1
ci3 =
   185.6926
   229.5074
zval3 =
       0
>> [h4,p4,ci4,zval4] = ztest(Horsepower,meanhorsepower,stdhorsepower,0.05)
h4 =
       0
p4 =
      1.0000
ci4 =
   103.0724
   121.0084
zval4 =
    8.8303e-07
>> [h5,p5,ci5,zval5] = ztest(Weight,meanweight,stdweight,0.05)
h5 =
       0
p5 =
       1
ci5 =
    1.0e+03 *
      2.8537
      3.1700
zval5 =
       0
>> [h6,p6,ci6,zval6] = ztest(MPG,meanmpg,stdmpg,0.05)
h6 =
       0
p6 =
      1.0000
ci6 =
    22.0936
    25.3425
zval6 =
   -1.7970e-05
(b).
Normal fit is not applicable.
1.1.acceleration linear
Goodness of fit:
   SSE: 803.7
   R-square: 0.1837
   Adjusted R-square: 0.1658
   RMSE: 2.972
   MSE: 8.8328
The linear curve did not fit well for the dataset, as the R2, and adjusted R2 were too
small, only about 0.18.
1.2.acceleration polynomials
Goodness of fit:
   SSE: 780.8
   R-square: 0.207
   Adjusted R-square: 0.1983
   RMSE: 2.913
   MSE= 8.485569
The polynomial fitting with a degree of 1 did not fit well for the dataset, as the R2, and
adjusted R2 were too small, only about 0.2.
Goodness of fit:
  SSE: 765.5
  R-square: 0.2224
  Adjusted R-square: 0.2054
  RMSE: 2.9
  MSE= 8.41
The polynomial fitting with a degree of 2 did not fit well for the dataset, as the R2, and
adjusted R2 were too small, only about 0.20 or 0.22.
Goodness of fit:
  SSE: 761.8
  R-square: 0.2263
  Adjusted R-square: 0.2005
  RMSE: 2.909
  MSE= 8.462281
The polynomial fitting with a degree of 3 did not fit well for the dataset, as the R2, and
adjusted R2 were too small, only about 0.20 or 0.22.
1.3acceleration exponential
Goodness of fit:
   SSE: 786.8
   R-square: 0.2008
   Adjusted R-square: 0.1922
   RMSE: 2.924
   MSE= 8.549776
The exponential fitting did not fit well for the dataset, as the R2, and adjusted R2 were
too small, only about 0.20.
2.1cylinders linear
Goodness of fit:
   SSE: 146
   R-square: 0.4886
   Adjusted R-square: 0.4774
   RMSE: 1.267
   MSE= 1.605289
The linear fitting did not fit well for the dataset, as the R2, and adjusted R2 were too
small, only about 0.48.
2.2cylinders polynomials
Goodness of fit:
   SSE: 87.66
   R-square: 0.693
   Adjusted R-square: 0.6896
   RMSE: 0.9761
   MSE= 0.95277
The polynomial fitting with a degree of 1 did not fit well for the dataset, as the R2, and
adjusted R2 were too small, only about 0.69.
Goodness of fit:
  SSE: 52.11
  R-square: 0.8175
  Adjusted R-square: 0.8135
  RMSE: 0.7567
  MSE= 0.57259
The polynomial fitting with a degree of 2 fitted better than the former two fittings and
was fairly good, as the R2 and adjusted R2 were about 0.82 and 0.81.
 Goodness of fit:
  SSE: 48.34
  R-square: 0.8307
  Adjusted R-square: 0.825
  RMSE: 0.7329
  MSE= 0.53714241
The polynomial fitting with a degree of 3 fitted slightly better than the former ones and
was fairly good, as the R2 and adjusted R2 were about 0.83.
2.3cylinders exponential
Goodness of fit:
   SSE: 67.73
   R-square: 0.7628
   Adjusted R-square: 0.7602
   RMSE: 0.858
   MSE= 0.736164
The exponential fitting did not fit the dataset well enough, as the R2 and adjusted R2
were about 0.76.
3.1displacement linear
Goodness of fit:
   SSE: 5.953e+05
   R-square: 0.47
   Adjusted R-square: 0.4584
   RMSE: 80.88
   MSE= 6541.5744
The linear fitting did not fit well for the dataset, as the R2, and adjusted R2 were too
small, only about 0.47.
3.2displacement polynomials
Goodness of fit:
   SSE: 3.983e+05
   R-square: 0.6454
   Adjusted R-square: 0.6416
   RMSE: 65.8
   MSE: 4329.64
The polynomial fitting with a degree of 1 did not fit well for the dataset, as the R2, and
adjusted R2 were too small, only about 0.65.
Goodness of fit:
  SSE: 2.798e+05
  R-square: 0.7509
  Adjusted R-square: 0.7454
  RMSE: 55.45
  MSE: 3074.7025
The polynomial fitting with a degree of 2 fits better than the former ones, but still not
good enough, as the R2, and adjusted R2 were about 0.75.
Goodness of fit:
   SSE: 2.653e+05
   R-square: 0.7638
   Adjusted R-square: 0.7559
   RMSE: 54.3
   MSE: 2948.49
The polynomial fitting with a degree of 3 fits slightly better than the former ones and is
fairly good, as the R2 and adjusted R2 were about 0.76.
3.3displacement exponential
Goodness of fit:
   SSE: 3.114e+05
   R-square: 0.7228
   Adjusted R-square: 0.7197
   RMSE: 58.18
   MSE: 3384.9124
The exponential fitting did not fit well for the data set, as the R2, and adjusted R2 were
about 0.72.
4.1horsepower linear
   Goodness of fit:
     SSE: 1.033e+05
     R-square: 0.4542
     Adjusted R-square: 0.4421
     RMSE: 33.88
     MSE: 1,147.8544
The linear fitting did not fit well for the dataset, as the R2, and adjusted R2 were about
0.45.
4.2horsepower polynomials
Goodness of fit:
   SSE: 6.728e+04
   R-square: 0.6445
   Adjusted R-square: 0.6406
   RMSE: 27.19
   MSE: 739.2961
The polynomial fitting with a degree of 1 did not fit well for the dataset, as the R-square
and Adjusted R-square were only about 0.64.
Goodness of fit:
  SSE: 4.308e+04
  R-square: 0.7724
  Adjusted R-square: 0.7673
  RMSE: 21.88
  MSE: 478.7344
The polynomial fitting with a degree of 2 fitted better than the former ones, but not well
enough, as the R-square was 0.7724 and Adjusted R-square was 0.7673.
Goodness of fit:
  SSE: 4.146e+04
  R-square: 0.7809
  Adjusted R-square: 0.7735
  RMSE: 21.58
  MSE: 465.6964
The polynomial fitting with a degree of 3 fitted better than the former ones, but still
not well enough, as the R-square was 0.78 and Adjusted R-square was 0.77.
4.3horsepower exponential
Goodness of fit:
   SSE: 5.003e+04
   R-square: 0.7356
   Adjusted R-square: 0.7327
   RMSE: 23.45
   MSE: 549.9025
The exponential fitting did not fit well for the dataset, as the R-square was only 0.7356
and Adjusted R-square was only 0.7327.
5.1weight linear
Goodness of fit:
   SSE: 2.701e+07
   R-square: 0.5482
   Adjusted R-square: 0.5382
   RMSE: 544.8
   MSE: 296807.04
The linear fitting did not fit well for the dataset, as the R-square was only 0.5482 and
Adjusted R-square was only 0.5382.
5.2weight polynomials
Goodness of fit:
   SSE: 1.565e+07
   R-square: 0.7381
   Adjusted R-square: 0.7353
   RMSE: 412.5
   MSE: 170156.25
The polynomial fitting with a degree of 1 fitted fairly well but not well enough, as the
R-square was 0.74 and Adjusted R-square was 0.73.
Goodness of fit:
  SSE: 1.038e+07
  R-square: 0.8264
  Adjusted R-square: 0.8226
  RMSE: 337.7
  MSE: 114041.29
The polynomial fitting with a degree of 2 fitted well for the dataset, as the R-square was
0.8264 and the Adjusted R-square was 0.8226.
Goodness of fit:
  SSE: 1.029e+07
  R-square: 0.8278
  Adjusted R-square: 0.8221
  RMSE: 338.1
  MSE: 114311.61
The polynomial fitting of a degree of 3 fitted well for the dataset, as the R-square was
0.8278 and the Adjusted R-square was 0.8221.
5.3weight exponential
Goodness of fit:
   SSE: 1.246e+07
   R-square: 0.7916
   Adjusted R-square: 0.7893
   RMSE: 368
   MSE: 135424
The exponential fitting fitted fairly well but not well enough for the dataset, as the
R-square was 0.7916 and the Adjusted R-square was 0.7893.
(c).
>> load carsmall
>> subplot(3,2,1);
>> plot(Model_Year,Acceleration,':ok');
>> title('Acceleration-Time')
xlabel('Time(year)')
ylabel('Acceleration')
>> subplot(3,2,2);
>> plot(Model_Year,Cylinders,':ok');
>> title('Cylinders-Time');
>> xlabel('Time(year)');
>> ylabel('Cylinders');
>> subplot(3,2,3);
>> plot(Model_Year,Displacement,':ok');
>> title('Displacement-Time');
>> xlabel('Time(year)');
>> ylabel('Displacement');
>> subplot(3,2,4);
>> plot(Model_Year,Horsepower,':ok');
>> title('Horsepower-Time');
>> xlabel('Time(year)');
>> ylabel('Horsepower');
>> subplot(3,2,5);
>> plot(Model_Year,Weight,':ok');
>> title('Weight-Time');
>> xlabel('Time(year)');
>> ylabel('Weight');
>> subplot(3,2,6);
>> plot(Model_Year,MPG,':ok');
>> title('MPG-Time');
>> xlabel('Time(year)');
>> ylabel('MPG');
On average, acceleration increased slightly over time, and the number of cylinders
changed slightly with no clear pattern. Displacement, horsepower and weight all decreased
between model years 70 and 82, while MPG improved over the same period.
e.
Suppose that carbig is the population and carsmall is the sample. carsmall is made up of
observations that are also included in carbig, so its descriptive characteristics reveal
some of the characteristics of the population to an extent, as it follows the same
trends. However, because it does not contain all the observations in carbig, some of its
characteristics can differ noticeably from those of the population. The sample can
therefore be regarded as broadly representative of the population, but its characteristics
cannot be taken directly as the characteristics of the population. There was no major
difference in terms of hypothesis testing. In terms of fitting, the curves usually fit the
carsmall data better, as it contains fewer data points than carbig. I do not think the
sample can accurately represent the whole data set, as it does not contain enough data to
capture all the characteristics of the population.
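One way to quantify how closely the sample follows the population is to test the sample
mean against the population mean, rather than against itself. The sketch below does this
for Acceleration using the means and standard deviation reported in parts a and d. The
sample size n = 100 is an assumption inferred from the width of the reported carsmall
confidence interval, and the p-value uses the standard normal identity
p = erfc(|z|/sqrt(2)):

```matlab
% Values reported earlier in this document.
popMean    = 15.5197;   % carbig mean of Acceleration
popStd     = 2.8034;    % carbig standard deviation of Acceleration
sampleMean = 15.028;    % carsmall mean of Acceleration

% Assumption: n = 100, inferred from the reported carsmall confidence
% interval width; it is not stated in the text.
n = 100;

% z-test of the sample mean against the population mean.
z = (sampleMean - popMean) / (popStd / sqrt(n));   % about -1.75
p = erfc(abs(z) / sqrt(2));                        % two-sided p-value
rejectH0 = p < 0.05;
```

Under these assumptions the difference in mean Acceleration is not significant at the 5%
level, which is consistent with treating carsmall as broadly representative for this
variable. The same calculation could be repeated for the other variables.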