VIETNAM NATIONAL UNIVERSITY
UNIVERSITY OF LANGUAGES AND INTERNATIONAL STUDIES
             SUBJECT: MANAGERIAL ECONOMICS
             THEME:   IRIS SETOSA SPECIES
                  Author: Nguyen Hong Nhung
Class      : KTTC.SNHU22E6
Lecturer   : Ph.D. Dang Ngoc Sinh
Name            : Nguyen Hong Nhung
Student ID    : 22043319
                        HANOI – 2024
Table of contents
1. ABSTRACT
2. INTRODUCTION
3. LITERATURE BACKGROUND
3.1 Literature Review
3.2 Literature Question
4. CONTENTS
       Economic significance of regression coefficients
       Statistical significance of regression coefficients
       Multicollinearity
       Testing multicollinearity phenomenon
5. DISCUSSION
6. CONCLUSION
7. REFERENCES
1. ABSTRACT

This paper examines the relationship between sepal width and sepal length in Iris setosa, a species characterized by its distinctive morphology. Sepal width (X) and sepal length (Y), both measured in centimeters, are among the most commonly used distinguishing features of this species. Other biological attributes are also correlated with these measurements, and knowledge of such correlations is vital to the study of plant morphology and taxonomy.

Through regression analysis, this research quantifies the relationship between sepal width and sepal length in Iris setosa. The evidence points to a positive linear relationship: as the sepal widens, it also tends to lengthen. The derived regression equation serves as a basis for estimating sepal length from sepal width with a considerable level of confidence.

These results have implications not just for botanical study but also for the classification of plants. By clarifying the roles of expansion and insertion in the plants under study, and by placing these results within the body of knowledge on plant morphology, the study benefits those who use morphology as a means of species identification. It also offers a basis for subsequent investigations of potential interactions among other floral characteristics, thereby improving the speed and reliability of taxonomic work in botany and contributing to the general understanding of plant structures.

2. INTRODUCTION

Iris setosa belongs to the well-researched genus Iris, which is studied largely because of the unique structural form of its flowers. Among its morphological features, sepal width and sepal length are considered the most important for describing structural differences among plants and their relation to growth and development. Sepal width, measured in centimeters, characterizes how wide the sepal is, while sepal length measures how long it is. Together, these traits serve as key reference benchmarks for plant taxonomy and the analysis of plant form.

Understanding how sepal width and length are related is significant in plant biology, since it helps reveal how these characteristics shape the overall structure of the flower. Because growth patterns in Iris setosa are relatively stable, investigating this relationship in that species adds to our understanding of the
morphological properties of this species and supports comparative analysis with other species of Iris.

In this research, the author wants to determine whether sepal width has a linear relationship with sepal length in Iris setosa specimens by performing a regression test. More precisely, it aims to test whether these two characteristics are linearly related and, if so, to what extent variation in sepal width affects sepal length. In this way, the study advances the knowledge of plant morphology and taxonomy, which is of great utility for botanists, horticulturalists and anyone interested in the structural behaviour of flowering plants.

Furthermore, this study fits within the emerging paradigm of quantitative approaches in botany, which responds to the need for quantitative techniques that improve the precision of species identification and classification. The emphasis of this study on Iris setosa underlines the significance of morphological traits as quantitative indices of plant attributes for subsequent studies on related species and floral characters to build on.

3. LITERATURE BACKGROUND

3.1 Literature Review

Introduction to Iris Setosa:
Iris Setosa, belonging to the genus Iris in the family Iridaceae, is a flower with diverse biological and structural characteristics that grows in the Northern Hemisphere. It stands out for its characteristic purple-blue color and is used as a research object by biologists and data scientists because of its easily identifiable properties and stable numerical patterns.

According to Fisher (1936), the Iris data set, which includes Iris Setosa, was generated from flower morphological measurements, namely the length and width of the sepals and petals. This dataset has become a standard in classification, clustering, and data analysis problems.

B, Biological characteristics of Iris Setosa
Geometry:
Research by Anderson (1935) has shown that Iris Setosa differs clearly from other species of the same genus, such as Iris Versicolor and Iris Virginica, in the following characteristics:
- Petals are shorter (usually less than 1.9 cm).
- The sepal is narrower, ranging from 2.0–4.0 cm.
Geographic distribution:
This species is mainly distributed in North America and Europe, where it often grows on moist soil or along riversides. According to research by Mabberley (2008), Iris Setosa is highly adaptable to changing environments, which explains its wide distribution.

This research highlights the significance of sepal width and sepal length in morphological and taxonomic studies, particularly for Iris setosa. These traits are critical for understanding developmental processes, environmental influences, and species differentiation. Prior studies have utilized regression models to explore correlations between sepal characteristics and other plant
traits. Despite advances in data analysis, species-specific studies on Iris setosa remain essential to provide precise insights into its morphology and taxonomy.

3.2 Literature Question

This study seeks to investigate how sepal width and sepal length correlate in Iris Setosa across varying environmental conditions and genotypic variations. It also aims to explore the extent to which these two morphological traits influence and predict other plant features such as petal size, plant height, and overall floral structure. Furthermore, it examines the impact of environmental factors on the stability of sepal dimensions and whether these influences differ across populations of Iris Setosa. Another critical question is the efficacy of traditional regression models compared to modern data analysis tools, such as machine learning techniques, in identifying associations between traits and estimating fluctuations within plant genotypes. By addressing these questions, the research aims to contribute new insights into the morphological and taxonomic understanding of Iris Setosa, with broader implications for plant morphology studies in angiosperms.

4. CONTENTS
4.1. Data
X (sepal width, cm)      Y (sepal length, cm)
3.5                      5.1
3.0                      4.9
3.2                      4.7
3.1                      4.6
3.6                      5.0
3.9                      5.4
3.4                      4.6
3.4                      5.0
2.9                      4.4
3.1                      4.9
3.7                      5.4
3.4                      4.8
3.0                      4.3
4.0                      5.8
4.4                      5.7
3.9                      5.4
3.5                      5.1
3.8                      5.7
3.8                      5.1
3.4                      5.4
3.7                      5.1
3.6                      4.6
3.3                      5.1
3.4                      4.8
3.0                      5.0
3.4                      5.0
3.5                      5.2
3.4                      5.2
3.2                      4.7
3.1                      4.8
3.4                      5.4
4.1                      5.2
4.2                      5.5
3.1                      4.9
3.2                      5.0
3.5                      5.5
3.6                      4.9
3.0                      4.4
3.4                      5.1
3.5                      5.0
2.3                      4.5
3.2                      4.4
3.5                      5.0
3.8                      5.1
3.0                      4.8
3.8                      4.6
3.7                      5.3
3.3                      5.0
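
For readers who want to check the figures reported for this sample without Stata, the following is a minimal Python sketch using only the standard library. It computes descriptive statistics and a simple least-squares line from the 48 pairs listed above. The names DATA, describe and fit_line are illustrative assumptions and are not part of the original Stata workflow.

```python
import math
import statistics

# (sepal width X, sepal length Y) in cm, copied from the data table above.
DATA = [
    (3.5, 5.1), (3.0, 4.9), (3.2, 4.7), (3.1, 4.6), (3.6, 5.0), (3.9, 5.4),
    (3.4, 4.6), (3.4, 5.0), (2.9, 4.4), (3.1, 4.9), (3.7, 5.4), (3.4, 4.8),
    (3.0, 4.3), (4.0, 5.8), (4.4, 5.7), (3.9, 5.4), (3.5, 5.1), (3.8, 5.7),
    (3.8, 5.1), (3.4, 5.4), (3.7, 5.1), (3.6, 4.6), (3.3, 5.1), (3.4, 4.8),
    (3.0, 5.0), (3.4, 5.0), (3.5, 5.2), (3.4, 5.2), (3.2, 4.7), (3.1, 4.8),
    (3.4, 5.4), (4.1, 5.2), (4.2, 5.5), (3.1, 4.9), (3.2, 5.0), (3.5, 5.5),
    (3.6, 4.9), (3.0, 4.4), (3.4, 5.1), (3.5, 5.0), (2.3, 4.5), (3.2, 4.4),
    (3.5, 5.0), (3.8, 5.1), (3.0, 4.8), (3.8, 4.6), (3.7, 5.3), (3.3, 5.0),
]


def describe(values):
    """Return count, mean, sample standard deviation, minimum and maximum."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # divides by n - 1
        "min": min(values),
        "max": max(values),
    }


def fit_line(pairs):
    """Ordinary least squares for y = b1 + b2 * x with a single regressor."""
    n = len(pairs)
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b2 = sxy / sxx                      # slope
    b1 = y_bar - b2 * x_bar             # intercept
    rss = sum((y - (b1 + b2 * x)) ** 2 for x, y in pairs)
    r2 = 1 - rss / syy                  # coefficient of determination
    se_b2 = math.sqrt(rss / (n - 2) / sxx)
    return {"b1": b1, "b2": b2, "r2": r2, "se_b2": se_b2, "t_b2": b2 / se_b2}


summary_x = describe([x for x, _ in DATA])
summary_y = describe([y for _, y in DATA])
fit = fit_line(DATA)

print("x:", summary_x)
print("y:", summary_y)
print("fit:", fit)
```

On this sample the sketch gives a slope close to 0.66 and an intercept close to 2.73. The closed-form formulas are used here rather than a statistics package so that each quantity can be traced back to the sums it is built from.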
Variables:
X represents the sepal width (cm).
Y represents the sepal length (cm).

Summarize x y:

  Variable   Obs       Mean   Std. Dev.   Min   Max
         x    48   3.441667    .3802761   2.3   4.4
         y    48   5.008333    .3583849   4.3   5.8

X ranges from 2.3 to 4.4 and Y from 4.3 to 5.8. The standard deviations, about 0.38 for X and 0.36 for Y, are small relative to the means, which shows a fairly high concentration of points.

Suppose we have a regression model:
Y = β1 + β2X     (1)
Estimating regression model (1) using Stata, we obtain the following result:

Table 1: regression model Y = β1 + β2X
Dependent variable: Y
Method: Least Squares
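Date: 16/11/2024
Time: 9:14 am
Included observations: 48

      Source |       SS       df       MS              Number of obs =      48
-------------+------------------------------           F(  1,    46) =   44.57
       Model |  2.97057974     1  2.97057974           Prob > F      =  0.0000
    Residual |  3.06608637    46  .066654052           R-squared     =  0.4921
-------------+------------------------------           Adj R-squared =  0.4810
       Total |  6.03666611    47  .128439705           Root MSE      =  .25817

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .6611083   .0990297     6.68   0.000      .461772    .8604447
       _cons |   2.733019   .3428582     7.97   0.000     2.042881    3.423157
------------------------------------------------------------------------------

From the above estimation results we obtain:
(PRF): $E(Y \mid X) = \beta_1 + \beta_2 X$
(SRF): $\hat{Y} = 2.73 + 0.66X$

4.2. Analyze regression results

Economic significance of regression coefficients
- $\hat{\beta}_1 = 2.73 > 0$
Shows that when sepal width is zero, the estimated sepal length is a constant 2.73 cm.
- $\hat{\beta}_2 = 0.66 > 0$
The positive sign shows a direct relationship: every 1 cm increase in sepal width is associated with an average increase of 0.66 cm in sepal length.

Statistical significance of regression coefficients
Test pair of hypotheses:
$H_0: \beta_j = 0$
$H_1: \beta_j \neq 0$   (j = 2)

Test statistic: $T = \dfrac{\hat{\beta}_j - \beta_j}{Se(\hat{\beta}_j)} \sim T(n-2)$

Rejection region: $W_\alpha = \{T : |T| > t^{(46)}_{0.025} \approx 2.013\}$

$T_{qs2} = 6.68 \in W_\alpha$ → Reject H0, accept H1 → $\beta_2$ is statistically significant.

4.3. Confidence interval for regression coefficients
The confidence interval for the regression coefficients is given by the following formula:
$\hat{\beta}_i - t^{(n-k)}_{\alpha/2} Se(\hat{\beta}_i) < \beta_i < \hat{\beta}_i + t^{(n-k)}_{\alpha/2} Se(\hat{\beta}_i)$

The confidence interval for the intercept is calculated as:
$\hat{\beta}_1 - t^{(n-2)}_{\alpha/2} Se(\hat{\beta}_1) < \beta_1 < \hat{\beta}_1 + t^{(n-2)}_{\alpha/2} Se(\hat{\beta}_1)$
2.733 - 2.013*0.343 < β1 < 2.733 + 2.013*0.343
2.043 < β1 < 3.423

That means that, with 95% confidence, the intercept lies in the range (2.043; 3.423), which agrees with the interval reported by Stata.

Similarly we have:
0.462 < β2 < 0.860

That means that, with 95% confidence, a 1 cm increase in sepal width (X) is associated with an increase in average sepal length of between 0.462 cm and 0.860 cm.

4.4. Test the appropriateness of the model.

Test pair of hypotheses:
H0: R² = 0
H1: R² ≠ 0

Test statistic: $F = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)} = \dfrac{R^2/1}{(1-R^2)/46} \sim F(1; 46)$

Rejection region: $W_\alpha = \{F : F > F_{0.05}(1; 46) \approx 4.05\}$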
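
As a quick arithmetic check on the F statistic, it can be recomputed from the sums of squares in the Stata output (Model SS = 2.97057974 on 1 degree of freedom, Residual SS = 3.06608637 on 46 degrees of freedom), or equivalently from R-squared = 0.4921. The symbols ESS and RSS for the model and residual sums of squares are notation assumed here for illustration, with k = 2 estimated coefficients and n = 48 observations:

```latex
F = \frac{ESS/(k-1)}{RSS/(n-k)}
  = \frac{2.97057974/1}{3.06608637/46}
  = \frac{2.97057974}{0.066654052}
  \approx 44.57,
\qquad
F = \frac{R^2/(k-1)}{(1-R^2)/(n-k)}
  = \frac{0.4921/1}{0.5079/46}
  \approx 44.57 .
```

Both routes give the value that Stata reports as F(1, 46).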
We have: F = 44.57 ∈ Wα
→ Reject H0, accept H1
→ The model is suitable.

R² = 0.4921 shows that the independent variable explains 49.2% of the variation in the dependent variable.

4.5. Check the model for defects

Multicollinearity
Testing multicollinearity phenomenon
Multicollinearity can arise only among two or more independent variables. Because this model has only 1 independent variable, there is no multicollinearity phenomenon, and the model is appropriate in this respect.

5. DISCUSSION

The findings of this research show a positive linear relationship between sepal width and sepal length in Iris setosa, with the regression equation Y = 2.73 + 0.66X. Sepal width is directly related to sepal length: the two variables tend to rise together. More precisely, each 1 cm increase in sepal width is associated with an average increase of 0.66 cm in sepal length. The slope coefficient (0.66) is positive and statistically significant (t = 6.68), which confirms a relatively strong positive relationship between the two morphological characters.

The positive correlation observed agrees with other research on plant morphometry in which floral dimensions are known to have positive relationships. In Iris setosa these results may mean that the dimensions of sepals are controlled during growth in a coordinated manner by genetic and environmental influences. For example, larger sepals could provide stronger support, which in turn would enhance the flower's harmony and practical function. The intercept value of 2.73 means that, when the sepal width is equal to zero, the predicted sepal length would be 2.73 cm. A sepal width of zero lies outside the observed range for Iris setosa and is not biologically possible, but the intercept is still needed in the regression to position the fitted line over the observed range of the variables.

As the data show, these conclusions have some implications for the morphological and taxonomic study of plants. Awareness of the correlation between sepal width and sepal length is important in species identification, since shape is an important factor when identifying species in botany. In addition, it offers a better understanding of how Iris setosa might persist under various environmental conditions, given that its physical dimensions could be affected by factors such as soil type and quality, availability of light, or water.

However, the present study has its limitations. The analysis is limited to a first-order (linear) model, which presupposes that changes in sepal width are paralleled by changes in
sepal length in proportion. Although this assumption accords with the data, non-linear relationships may be present under some circumstances or in other species of Iris. Sepal size may also be affected by factors not addressed in this study, including genetic variation or environmental change. Future work could build on this by using multivariate analysis to include other factors affecting the dimensions of the sepal.

Thus, the work provides evidence of a fairly strong positive correlation between sepal width and sepal length in Iris setosa (R² = 0.4921, which corresponds to a correlation coefficient of about 0.70). The study therefore extends the knowledge of the morphology of the species and joins the global discourse on plant development. Follow-up molecular studies are suggested, which may shed further light on the genetic and environmental factors that contribute to floral formation.

6. CONCLUSION

Y = 2.73 + 0.66X

This equation shows a positive linear relationship between sepal width and sepal length. In other words, an increase in sepal width of 1 cm corresponds to an increase in sepal length of about 0.66 cm on average. The constant term, 2.73, is the estimated sepal length when sepal width is zero, although such a value is not realistic since it lies outside the range of actual biological data for Iris setosa.

The study also indicates that sepal width is closely related to sepal length. This kind of relationship is important for researchers and botanists because it provides data on patterns of plant growth, classification and plant morphology across different species.

Therefore, the outcomes of this paper are consistent with existing knowledge about the morphology of Iris setosa and show that regression analysis can be valuable in discovering correlations between the dimensions of plants' physical characteristics. The research can be taken further and applied to explore other factors that could give better insight into plant morphology, for instance environmental factors and genetic differences.

7. REFERENCES

UCI Machine Learning Repository - Iris Data Set: This dataset, originally compiled by Fisher in 1936, is widely used in statistics and machine learning for classification methods. It includes data on Iris setosa among other iris species. Available at: https://archive.ics.uci.edu/ml/datasets/Iris

Oxford Academic - The Iris Data Set: In Search of the Source of Virginica: This paper discusses the historical significance and usage of the Iris data set in statistical research. Available at: https://academic.oup.com/jrssig/article/18/6/26/7038520

International Journal of Scientific Research in Mathematical and Statistical Sciences - Research Paper: Classification of Iris Flower Dataset using Different Algorithms. Available at: https://www.researchgate.net/profile/Shohal-Hossain-2/publication/367220930_Classification_of_Iris_Flower_Dataset_using_Different_Algorithms/links/63c7bf9b6fe15d6a572a5497/Classification-of-Iris-Flower-Dataset-using-Different-Algorithms.pdf

U.S. Forest Service - Caring for the land and serving people: Introduction to Iris setosa. Available at: https://www.fs.usda.gov/wildflowers/beauty/iris/Blue_Flag/iris_setosa.shtml

Flora of North America: Description of the species, including sizes in cm. Available at: http://www.efloras.org/florataxon.aspx?flora_id=1&taxon_id=200028212