Group Assignment no.
Subject: Quantitative Reasoning 2
Topic: Statistical Tests & Software
Teacher: Ma'am Zainab Rehman
Student Names:
1. Muhammad Mohsin Shafique (BB2337)
2. Muhmmad Mugheera Qasmi (BB2345)
3. Moazam Khalid (BB2328)
4. Abdul Mueed (BB2376)
5. Khizar Iqbal (BB2327)
Statistical Test 1:
(BB2337)
Analysis of Variance (ANOVA)
ANOVA, which stands for Analysis of Variance, is a statistical test used to analyze the
difference between the means of more than two groups.
Formula:
ANOVA coefficient, F= Mean sum of squares between the groups (MSB)/ Mean squares
of errors (MSE).
Therefore F = MSB/MSE
where,
● Mean squares between groups, MSB = SSB / (k – 1)
● Mean squares of errors, MSE = SSE / (N – k)
● Total degrees of freedom, N – 1= df3
● Degrees of freedom of errors, N – k = df2, where N is the total number of
observations across all k groups.
● Degrees of freedom between groups, k – 1= df1, where k is the number of
groups.
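Moreover, the ANOVA table below represents its many components:
Source of variation | Sum of squares | Degrees of freedom | Mean squares | F value
Between groups | SSB = Σ nj (X̄j – X̄)² | df1 = k – 1 | MSB = SSB / (k – 1) | F = MSB / MSE
Error | SSE = Σ Σ (X – X̄j)² | df2 = N – k | MSE = SSE / (N – k) |
Total | SST = SSB + SSE | df3 = N – 1 | |
For the above table, the following represents: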
● SSB = sum of squares between groups
● SSE = sum of squares of errors
● X̄j = mean of the jth group,
● X̄ = overall mean, and nj is the sample size of the jth group.
● X = each data point in the jth group (individual observation)
● N = total number of observations/total sample size,
● and SST = Total sum of squares = SSB + SSE
If the value of F is close to 1, the variation between the group means is about the same
as the variation within the groups, so there is no significant difference between the
means of the groups under observation.
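As a rough illustration of the formulas above, the following Python sketch computes SSB, SSE, MSB, MSE and F for three small groups. The data values and the function name one_way_anova_f are made up for this example and do not come from the sources; only the Python standard library is used.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df1, df2) for a list of groups of observations."""
    k = len(groups)                               # number of groups
    n_total = sum(len(g) for g in groups)         # N, total observations
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)                 # overall mean
    group_means = [mean(g) for g in groups]       # mean of each group

    # SSB: spread of the group means around the overall mean
    ssb = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # SSE: spread of each observation around its own group mean
    sse = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g)

    df1 = k - 1            # degrees of freedom between groups
    df2 = n_total - k      # degrees of freedom of errors
    msb = ssb / df1
    mse = sse / df2
    return msb / mse, df1, df2


# Hypothetical data: three groups of three observations each
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df1, df2 = one_way_anova_f(groups)
print(f_value, df1, df2)
```

With these made-up numbers, SSB = 26 and SSE = 6, so MSB = 13, MSE = 1 and F = 13 with df1 = 2 and df2 = 6. A value this far above 1 suggests that the group means differ, although the formal decision still requires comparing F with the critical value from an F table.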
Advantages:
● It allows comparisons of more than two group means simultaneously. ANOVA
can compare 3,4,5, and more group means together.
● ANOVA tests all the group means in a single analysis rather than through
multiple t-tests. This makes the analysis quicker and easier to perform.
● It is a robust technique that works well even if assumptions are slightly violated.
Disadvantages:
● Errors in the study design can limit the conclusions drawn from ANOVA.
● The results do not provide information about the direction or magnitude of group
differences.
● It relies on assumptions. Violation of assumptions can reduce the power and
accuracy of ANOVA.
JMP Statistical Software:
JMP provides tools to manage data, including reshaping and preparing data for analysis,
basic navigation, creating summary tables, and saving and sharing analysis results. It
also offers interactive tools for teaching and learning statistical concepts.
Applications:
1. Advanced Computational Statistics
2. Automation & Scripting
3. Biometrics & Biostatistics
4. Clinical Data Analysis
5. Consumer & Market Research
PROS:
1. Easy to use and user friendly.
2. Allows you to easily load data from databases, CSV files, etc.
3. Comes with good data manipulation and visualization features.
4. Good documentation support.
CONS:
1. When a question comes up, it sometimes takes a lot of research to find an
answer.
2. Support for more data sources could be added.
3. Cost is somewhat high compared to other similar tools.
Data gathered from:
1. TEXT Book.com
2. wallstreetmojo.com
3. capterra.com
4. trustradius.com
Statistical Test 2:
(BB2345)
One-Sample t-Test
The one-sample t-test is a statistical hypothesis test used to determine whether an
unknown population mean is different from a specific value.
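Formula:
t = (X̄ – μ0) / SE
Where,
X̄ = the sample mean
μ0 = the hypothesized population mean
SE = the estimated standard error of the mean, SE = s / √n (s is the sample standard
deviation and n is the sample size)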
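A minimal sketch of how the t statistic (the difference between the sample mean and the hypothesized mean, divided by the estimated standard error) might be computed in Python. The sample values, the hypothesized mean of 4 and the function name one_sample_t are assumptions made for illustration only.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return the one-sample t statistic for a hypothesized mean mu0."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)      # estimated standard error, s / sqrt(n)
    return (mean(sample) - mu0) / se


# Hypothetical sample of 8 observations, tested against mu0 = 4
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t_value = one_sample_t(sample, mu0=4)
degrees_of_freedom = len(sample) - 1
print(t_value, degrees_of_freedom)
```

Here the sample mean is 5 and t is about 1.32 with 7 degrees of freedom. This value would then be compared with the critical value from a t table to decide whether the population mean differs from 4.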
Application (conditions under which the test applies):
● The data are continuous.
● The sample data have been randomly sampled from a population.
● The observations are independent of one another.
● The distribution is approximately normal.
Advantages:
● Can handle small sample sizes effectively.
● Supports individual comparisons, providing insight into differences between
specific groups.
● t-tests are suitable when comparing two groups at a time.
Disadvantages:
● Limited to pairwise comparisons.
● Conducting multiple t-tests without adjusting the significance level increases the
chance of a false positive (Type I error).
● Multiple t-tests can become cumbersome and time-consuming.
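The problem with running many t-tests can be illustrated with a short calculation. The sketch below assumes that every pair of groups is tested, that the tests are independent, and that each uses a significance level of 0.05. These are simplifying assumptions, and the function name is hypothetical.

```python
from math import comb


def familywise_error_rate(num_groups, alpha=0.05):
    """Chance of at least one false positive when every pair of groups
    is compared with its own t-test (tests assumed independent)."""
    num_tests = comb(num_groups, 2)          # number of pairwise comparisons
    return 1 - (1 - alpha) ** num_tests


rate_3_groups = familywise_error_rate(3)     # 3 pairwise tests
rate_5_groups = familywise_error_rate(5)     # 10 pairwise tests
print(round(rate_3_groups, 3), round(rate_5_groups, 3))
```

Under these assumptions the chance of at least one false positive is about 0.143 for three groups and about 0.401 for five groups, compared with 0.05 for a single test.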
Software: Matlab Statistical Software
MATLAB is a high-performance language for technical computing.
Application:
1. Math and computation
2. Algorithm development
3. Modeling, simulation, and prototyping
4. Data analysis, exploration, and visualization
PROS:
1. Ease of use
2. Predefined functions
3. Graphical user interface
CONS:
1. Executes more slowly than compiled languages.
2. Limited functionality in the free version.
3. Limited scaling.
Reference:
1. jmp.com
2. utexas.edu
3. quora.com
4. javapoint.com
Statistical Test 3
(BB2328)
Linear Regression:
Linear regression analysis is used to predict the value of a variable based on the value
of another variable.
Formula:
y = a + bx
Where,
● x is the explanatory variable.
● y is the dependent variable.
● The slope of the line is b, and a is the intercept (the value of y when x = 0).
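As an illustration of the equation y = a + bx, the sketch below estimates the slope b and the intercept a with the usual least-squares formulas, b = Σ(x – x̄)(y – ȳ) / Σ(x – x̄)² and a = ȳ – b·x̄. These formulas are the standard ones and are not stated in the sources above; the data points and the function name are made up for this example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    x_bar = mean(xs)
    y_bar = mean(ys)
    # Slope: co-variation of x and y divided by the variation of x
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar     # intercept: value of y when x = 0
    return a, b


# Hypothetical data points
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
predicted_at_6 = a + b * 6    # predict y for a new value x = 6
print(a, b, predicted_at_6)
```

For these made-up points the fitted line is approximately y = 2.2 + 0.6x, so the predicted value at x = 6 is about 5.8.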
Table reference: researchgate.net
Application:
● Economics and Finance: Used to study the relationship between inflation and
unemployment.
● Medical and Healthcare: To study the relationship between patient age and
drug usage.
● Sports Analytics: Used to analyze player performance metrics.
● Energy and Utilities: Helps in optimizing energy distribution.
Advantages:
● Simplicity and interpretability.
● Predict and understand the relationship between two variables.
● Makes it easier to grasp more complicated processes.
● Simple linear regression helps us find the best line to describe the relationship,
evaluate the model's fit, and make statistical inferences.
Disadvantages:
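● Sensitive to outliers and noise.
● Results are unreliable when the error terms are correlated.
● Collinearity between explanatory variables (when more than one is used) makes
the coefficients hard to interpret.
● Results are unreliable when the variance of the error term is not constant.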
Software: Origin
Origin is a complete data analysis and graphing tool providing a range of features (peak
analysis, curve fitting, statistics…) to meet the quality requirements and specific needs
of the scientific community.
Pros:
● Easily connect to Excel files to bring in data.
● Very flexible project structure.
● Column-level and cell-level formulas make calculations easy and update
automatically when the data change.
Cons:
● Curve fitting and correlation analysis are difficult.
● Converting photos into data does not seem to be possible.
● No custom macros or process storage are available.
Reference:
● Stat.yale.edu
● 360Digitmg
● SCISPACE
● Towardsdatascience.com
● Originlab.com
● Softwareadvice.com
Statistical Test 4
(BB2376)
Hypothesis Testing:
Hypothesis testing provides a way to verify whether the results of an experiment are
statistically significant.
Formula:
z = (x̄ – μ) / (σ / √n)
where,
● x̄ is the sample mean
● μ is the population mean
● σ is the population standard deviation
● n is the sample size.
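A short sketch of how a test statistic built from these four quantities, z = (x̄ – μ) / (σ / √n), might be computed and turned into a decision. The numbers, the 0.05 significance level and the function name are assumptions chosen for illustration; the two-sided p-value uses NormalDist from the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """Return z = (sample_mean - pop_mean) / (pop_sd / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_sd / sqrt(n))


# Hypothetical values: sample mean 52, population mean 50,
# population standard deviation 10, sample size 100
z = z_statistic(52, 50, 10, 100)

# Two-sided p-value from the standard normal distribution
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Reject the null hypothesis at the (assumed) 5% significance level?
reject_null = p_value < 0.05
print(z, p_value, reject_null)
```

With these made-up values z = 2 and the two-sided p-value is about 0.0455, so the null hypothesis would be rejected at the 5% level.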
Application:
● Used in medical centres to test assumptions.
● Used in economics to verify whether the results of an experiment are
statistically significant.
● It involves the setting up of a null hypothesis and an alternate hypothesis.
Advantages:
● Provides a formal, objective method for deciding whether something “is or is not”
supported by the data.
● Allows you to evaluate the strength of your claim.
● Guides the research process.
Disadvantages:
● May over-simplify the problem.
● May not meet the client's needs.
● May not be well-suited to application.
Software: XLSTAT
XLSTAT is a powerful yet flexible Excel data analysis and statistic tool.
Pros:
● Makes complex data analysis tasks easier to tackle.
● Because it works inside Excel, it is easy to transfer data back and forth.
● Users of all levels of expertise can quickly develop insights from their data.
Cons:
● The software can be a bit expensive, particularly for small businesses with limited
budgets.
● Compatibility issues with certain versions of Excel.
● The user interface can be a bit overwhelming for first-time users.
Reference:
● cutemath.com
● csulb.edu
● capterra.com
Statistical Test 5: Chi-Square Test
(BB2327)
A chi-square test is a statistical test used to compare observed results with expected
results.
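The comparison of observed and expected results can be sketched with the standard chi-square statistic, χ² = Σ (O – E)² / E, where O is an observed count and E is the expected count. This formula is the usual textbook form and is an addition, not something shown in the sources; the counts and the function name below are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Return the chi-square statistic, the sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical goodness-of-fit example: 100 observations in 5 categories,
# with 20 expected in each category if all categories are equally likely
observed = [18, 22, 20, 25, 15]
expected = [20, 20, 20, 20, 20]

chi2 = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1    # categories minus one
print(chi2, degrees_of_freedom)
```

For these made-up counts χ² = 2.9 with 4 degrees of freedom. This is below the tabulated critical value of 9.488 at the 5% level, so the observed counts would not be judged significantly different from the expected ones.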
Application:
● Goodness of fit of distributions.
● Test of independence of attributes.
● Test of homogeneity.
Advantages of Chi Square Test:
● Can test association between variables.
● Identifies differences between observed and expected values.
● The Chi-square is robust with respect to the distribution of the data.
Disadvantages of Chi square test:
● Cannot be used with percentages; it requires actual counts.
● Data must be numerical (frequencies).
● Comparisons with only two categories are not reliable.
● The number of observations should be at least 20.
Software: RStudio
RStudio is an integrated development environment for R, a programming language for
statistical computing and graphics.
Pros:
● The introduction of RStudio Cloud has made generating reports quick and easy.
● The user can download packages, visualize their code and run it as well as debug
it.
● Easily available, convenient to install, free, multifunctional, and an asset for both
analysis and visualization.
● Allows users to analyze large datasets easily and quickly.
Cons:
● Memory-intensive since objects are stored in physical memory.
● Lacks security features and cannot be embedded in a web application.
● RStudio does not provide menu-driven support for data analysis.
● For new developers in R, UI may have some learning curve in comparison to other
IDEs.
Reference:
● southampton.ac.uk
● getrevising.co.uk
● slideshare.net
● getapp.com
● softwareadvice.com
The END