Group Assignment No.1

The document outlines a group assignment on statistical tests and software, detailing five different statistical tests: ANOVA, one-sample T-Test, linear regression, hypothesis testing, and chi-square test. Each test includes its formula, applications, advantages, disadvantages, and associated statistical software. The assignment is conducted by a group of students under the supervision of their teacher, Mam. Zainab Rehman.

Uploaded by

abdulmueed972

Group Assignment No. 1

Subject: Quantitative Reasoning 2

Topic: Statistical Tests & Software

Teacher: Ma'am Zainab Rehman

Student Names:
1. Muhammad Mohsin Shafique (BB2337)
2. Muhmmad Mugheera Qasmi (BB2345)
3. Moazam Khalid (BB2328)
4. Abdul Mueed (BB2376)
5. Khizar Iqbal (BB2327)

Statistical Test 1:

(BB2337)

Analysis of Variance (ANOVA)


ANOVA, which stands for Analysis of Variance, is a statistical test used to analyze the
difference between the means of more than two groups.

Formula:

The ANOVA test statistic, F, is the ratio of the mean sum of squares between the groups (MSB) to the mean squares of errors (MSE):

F = MSB / MSE

where,

● Mean squares between groups, MSB = SSB / (k – 1)
● Mean squares of errors, MSE = SSE / (N – k)
● Degrees of freedom between groups, df1 = k – 1, where k is the number of groups
● Degrees of freedom of errors, df2 = N – k, where N is the total number of observations across the k groups
● Total degrees of freedom, df3 = N – 1

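Moreover, the ANOVA table below summarizes these components:

Source of variation | Sum of squares | Degrees of freedom | Mean squares | F value
Between groups | SSB = Σ nj (X̄j – X̄)² | df1 = k – 1 | MSB = SSB / (k – 1) | F = MSB / MSE
Error | SSE = ΣΣ (X – X̄j)² | df2 = N – k | MSE = SSE / (N – k) |
Total | SST = SSB + SSE | df3 = N – 1 | |

In the table above: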

● SSB = sum of squares between groups
● SSE = sum of squares of errors
● X̄j = mean of the jth group
● X̄ = overall mean, and nj is the sample size of the jth group
● X = each data point in the jth group (individual observation)
● N = total number of observations (total sample size)
● SST = total sum of squares = SSB + SSE

If the value of F is close to 1, the variation between the group means is no larger than the variation within the groups, so the difference between the means of the groups under observation is not significant.
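
As a rough illustration of the formula, the sketch below computes F = MSB / MSE from raw data using only Python's standard library. The function name `one_way_anova` and the sample data are hypothetical, chosen for illustration; they do not come from the assignment's sources.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df1, df2) for a list of groups of observations."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # N, total number of observations
    grand_mean = mean(x for g in groups for x in g)

    # SSB: squared distance of each group mean from the overall mean,
    # weighted by the group's sample size nj.
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    # SSE: squared distance of each observation from its own group mean.
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)

    msb = ssb / (k - 1)          # mean squares between groups
    mse = sse / (n_total - k)    # mean squares of errors
    return msb / mse, k - 1, n_total - k


# Hypothetical data: three groups with three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df1, df2 = one_way_anova(groups)
print(f_value, df1, df2)
```

For this made-up data SSB = 26 and SSE = 6, so MSB = 13, MSE = 1, and F is approximately 13 with df1 = 2 and df2 = 6. A value this far above 1 suggests that the group means differ.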

Advantages:

● It allows comparison of more than two group means simultaneously. ANOVA can compare 3, 4, 5, or more group means together.
● ANOVA handles all groups in a single calculation rather than requiring multiple t-tests. This makes the analysis quicker and easier to perform.
● It is a robust technique that works well even if assumptions are slightly violated.

Disadvantages:

● Errors in the study design can limit the conclusions drawn from ANOVA.
● The results do not provide information about the direction or magnitude of group
differences.
● It relies on assumptions. Violation of assumptions can reduce the power and
accuracy of ANOVA.

JMP Statistical Software:

JMP provides tools to manage data, including reshaping and preparing data for analysis, basic navigation, creating summary tables, and saving and sharing analysis results. It also offers interactive tools for teaching and learning statistical concepts.

Applications:
1. Advanced Computational Statistics
2. Automation & Scripting
3. Biometrics & Biostatistics
4. Clinical Data Analysis
5. Consumer & Market Research

PROS:

1. Easy to use and user-friendly.
2. Lets you easily load data from databases, CSV files, and other sources.
3. Comes with good data manipulation and visualization features.
4. Good documentation support.

CONS:

1. When a question comes up, it sometimes takes a lot of research to find an answer.
2. Support for more data sources could be added.
3. The cost is somewhat high compared to other similar tools.

Data gathered from:

1. TEXT Book.com
2. wallstreetmojo.com
3. capterra.com
4. trustradius.com

Statistical Test 2:

(BB2345)

One-Sample T-Test


The one-sample t-test is a statistical hypothesis test used to determine whether an
unknown population mean is different from a specific value.

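Formula:

t = (X̄ – μ0) / SE

Where,

X̄ = the sample mean

μ0 = the hypothesized population mean

SE = the estimated standard error of the mean, s / √n (s is the sample standard deviation and n is the sample size)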
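
The sketch below shows one way to compute the one-sample t statistic, assuming the usual form t = (X̄ – μ0) / SE with SE = s / √n, where s is the sample standard deviation. The function name `one_sample_t` and the data are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)      # estimated standard error, s / sqrt(n)
    t = (mean(sample) - mu0) / se     # distance of the sample mean from mu0, in SE units
    return t, n - 1


# Hypothetical sample, tested against a hypothesized mean of 4.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t_value, df = one_sample_t(sample, mu0=4)
print(round(t_value, 4), df)
```

The resulting t value is then compared with the t distribution with n – 1 degrees of freedom (7 here) to decide whether the sample mean differs from μ0.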

Application:

The test applies when the following conditions hold:

● The data are continuous.
● The sample data have been randomly sampled from a population.
● There is homogeneity of variance (i.e., the variability of the data in each group is similar).
● The distribution is approximately normal.

Advantages:

● Can handle small sample sizes effectively.
● Individual comparisons: provides insight into differences between specific groups.
● Suitable when comparing two groups at a time.

Disadvantages:

● Limited to pairwise comparisons.
● Conducting multiple t-tests without adjusting the significance level inflates the chance of a false positive (Type I error).
● Multiple t-tests can become cumbersome and time-consuming.
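
To make the point about repeated t-tests concrete, the sketch below computes the chance of at least one false positive across m tests, assuming the tests are independent and each uses the same significance level. It also shows the Bonferroni adjustment, one common correction that the assignment itself does not name. Both function names are hypothetical.

```python
def familywise_error(alpha, m):
    """Chance of at least one false positive in m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m


def bonferroni_alpha(alpha, m):
    """Per-test significance level under the Bonferroni adjustment."""
    return alpha / m


# How the overall error rate grows as more tests are run at alpha = 0.05.
for m in (1, 3, 10):
    print(m, round(familywise_error(0.05, m), 4), bonferroni_alpha(0.05, m))
```

With three tests at the 0.05 level the overall chance of a false positive is already about 0.14, and with ten tests it is about 0.40.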

Software: MATLAB Statistical Software

MATLAB is a high-performance language for technical computing.

Application:

1. Math and computation
2. Algorithm development
3. Modeling, simulation, and prototyping
4. Data analysis, exploration, and visualization

PROS:

1. Ease of use
2. Predefined functions
3. Graphical user interface

CONS:

1. Executes more slowly than compiled languages.
2. Limited functions in the free version.
3. Limited scaling.

Reference:

1. jmp.com
2. utexas.edu
3. quora.com
4. javapoint.com

Statistical Test 3:

(BB2328)

Linear Regression:

Linear regression analysis is used to predict the value of a variable based on the value
of another variable.

Formula:

y=a+bx

Where,

● x is the explanatory variable.
● y is the dependent variable.
● b is the slope of the line, and a is the intercept (the value of y when x = 0).
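
The text does not say how a and b are estimated. A minimal sketch, assuming the ordinary least-squares estimates b = Σ(x – x̄)(y – ȳ) / Σ(x – x̄)² and a = ȳ – b·x̄, is shown below. The function name `fit_line` and the data are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept: value of y when x = 0
    return a, b


# Hypothetical data that lie exactly on y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = fit_line(xs, ys)
print(a, b)
print(a + b * 6)   # predicted y for a new x = 6
```

Because the made-up points fall exactly on a line, the fit recovers an intercept of 1 and a slope of 2. Real data would scatter around the fitted line.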

Table reference: researchgate.net


Application:

● Economics and Finance: Used to study the relationship between inflation and unemployment.
● Medical and Healthcare: Used to study the relationship between patient age and drug usage.
● Sports Analytics: Used to analyze player performance metrics.
● Energy and Utilities: Helps in optimizing energy distribution.

Advantages:

● Simplicity and interpretability.
● Predicts and explains the relationship between two variables.
● Makes it easier to grasp more complicated processes.
● Simple linear regression helps us find the best line to describe the relationship, evaluate the model's fit, and make statistical inferences.

Disadvantages:

● Sensitive to outliers and noise.
● Correlated error terms violate the model's assumptions.
● Collinearity among explanatory variables.
● Non-constant variance of the error term.

Software: Origin

Origin is a complete data analysis and graphing tool providing a range of features (peak
analysis, curve fitting, statistics…) to meet the quality requirements and specific needs
of the scientific community.

Pros:

● Easily connects to Excel files to bring in data.
● Very flexible project structure.
● Column-level and cell-level formulas for easy calculations that update automatically when the data change.

Cons:

● Curve fitting and correlation analysis are difficult.
● Converting photos into data seems to be possible
● No customizable macros or process storage available.

Reference:

● Stat.yale.edu
● 360Digitmg
● SCISPACE
● Towardsdatascience.com
● Originlab.com
● Softwareadvice.com

Statistical Test 4:

(BB2376)

Hypothesis Testing:

Hypothesis testing provides a way to check whether the results of an experiment are statistically significant rather than the product of chance.

Formula:

z = (x̄ – μ) / (σ / √n)

where,

● x̄ is the sample mean
● μ is the population mean
● σ is the population standard deviation
● n is the sample size.
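
Assuming the statistic intended here is the usual z statistic, z = (x̄ – μ) / (σ / √n), built from the four symbols listed above, the sketch below computes z and a two-sided p-value with Python's standard library. The function name `z_test` and the input numbers are hypothetical.

```python
from math import erfc, sqrt


def z_test(x_bar, mu, sigma, n):
    """Return (z, two-sided p-value) for a one-sample z-test."""
    z = (x_bar - mu) / (sigma / sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    p_two_sided = erfc(abs(z) / sqrt(2))
    return z, p_two_sided


# Hypothetical numbers: sample mean 52, hypothesized mean 50,
# population standard deviation 10, sample size 100.
z, p = z_test(x_bar=52, mu=50, sigma=10, n=100)
print(round(z, 4), round(p, 4))
```

Here z works out to 2, and the two-sided p-value is roughly 0.046, so the null hypothesis would be rejected at the 0.05 level but not at the 0.01 level.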

Application:

● Used in medical centres to test assumptions.
● Used in economics to verify whether the results of an experiment are statistically significant.
● It involves setting up a null hypothesis and an alternative hypothesis.

Advantages:
● Hypothesis testing is a formal, widely accepted method for deciding whether the data support a claim that something "is or is not" the case.
● Allows you to evaluate the strength of your claim.
● Guides the research process.

Disadvantages:
● May over-simplify the problem.
● May not meet the client's needs.
● May not be well-suited to application.

Software: XLSTAT

XLSTAT is a powerful yet flexible data analysis and statistics tool for Excel.

Pros:

● Makes it easy to tackle complex data analysis tasks.
● Because it integrates with Excel, it is easy to transfer data back and forth.
● Users of all levels of expertise can quickly develop insights from their data.

Cons:

● The software can be a bit expensive, particularly for small businesses with limited
budgets.
● Compatibility issues with certain versions of Excel.
● The user interface can be a bit overwhelming for first-time users.

Reference:

● cutemath.com
● csulb.edu
● capteera.com

Statistical Test 5: Chi-Square Test

(BB2327)

A chi-square test is a statistical test used to compare observed results with expected
results.

Application:

● Goodness of fit of distributions.
● Test of independence of attributes.
● Test of homogeneity.
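
The assignment gives no formula for this test. The standard statistic is χ² = Σ (O – E)² / E, summed over all categories, where O is an observed count and E the corresponding expected count. The sketch below applies it to a goodness-of-fit check. The function name `chi_square_statistic` and the die-roll counts are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Return the chi-square statistic, the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts from 120 rolls of a die; a fair die predicts 20 per face.
observed = [18, 22, 20, 25, 15, 20]
expected = [20] * 6
chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1   # degrees of freedom for a goodness-of-fit test
print(round(chi2, 4), df)
```

For these made-up counts χ² is about 2.9 with 5 degrees of freedom, well below the 5% critical value of roughly 11.07, so the data give no reason to doubt that the die is fair.
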

Advantages of Chi-Square Test:

● Can test association between variables.
● Identifies differences between observed and expected values.
● The chi-square test is robust with respect to the distribution of the data.

Disadvantages of Chi square test:

● Cannot be applied to percentages; it needs raw counts (frequencies).
● The data must be numerical.
● Comparisons with only two categories are not well suited to the test.
● The number of observations must be 20 or more.

Software: RStudio

RStudio is an integrated development environment for R, a programming language for statistical computing and graphics.

Pros:

● The introduction of RStudio Cloud has made generating reports quick and easy.
● The user can download packages, visualize code, and run and debug it.
● Easily available and convenient to install; free, multifunctional software that is an asset for both analysis and visualization.
● Allows users to analyze large datasets easily and quickly.

Cons:
● Memory-intensive, since objects are stored in physical memory.
● Lacks security features and cannot be embedded in a web application.
● RStudio does not provide menu-driven support for data analysis.
● For developers new to R, the UI may have a steeper learning curve than other IDEs.

Reference:

● southampton.ac.uk
● getrevising.co.uk
● slideshare.net
● getapp.com
● softwareadvice.com

The END
