Statistical Tests

Basic statistics.

Uploaded by

Uche Nwa Elijah

Introduction to statistical testing

Illustrated with XLSTAT

Jean Paul Maalouf
webinar@xlstat.com

Nov. 9, 2016

www.xlstat.com

1
Goal of this webinar
Let you become independent in using our web stat test selection tool
(whether you're an XLSTAT user or not)

Link

2
PLAN

XLSTAT: who are we?

Statistics: categories
Reminder on Descriptive / exploratory statistics
Statistical tests: principles, steps & practice on XLSTAT
Parametric vs non parametric tests: practice on XLSTAT
Tests on independent vs paired samples
Statistical tests: Comparison vs Association
Practice on XLSTAT: Fisher's exact test on a contingency table
Appendix: How to interpret p-value > alpha?

All the data in this webinar were made up unless otherwise specified

3
XLSTAT: Who are we?
XLSTAT is a user-friendly statistical add-on software for Microsoft Excel

4
XLSTAT
A growing software and team

1993: Thierry Fahmy develops a user-friendly solution for data analysis: XLSTAT is born
1996: XLSTAT makes its first sale on the Internet
2000: The company Addinsoft is created
2006: VBA interface, C++ computations, 7 languages
2009: New offers adapted to business needs
2015: New version, new products, new website, growing and dynamic team
2016: XLSTAT 365 (cloud version of XLSTAT for Excel 365) and XLSTAT Free (free limited edition)

5
XLSTAT in a few numbers

200+ statistical features: general or field-oriented solutions
50k users: across the world. Companies, education, research
16 employees: always receptive to the needs of users
130k visits/month on the website: easy tutorials available in 5 languages
7 languages
400 downloads/day

6
Statistics: 4 categories

7
Statistics: 4 categories

Description: I want to summarize small data sets (1-3 variables) using simple statistics or charts (mean, standard deviation, boxplots...)
Exploration: I want to easily extract information from a large data set without necessarily having a precise question to answer. (PCA, AHC...)
Tests: I want to accept / reject a very precise hypothesis assuming error risks. (t tests, ANOVA, correlation tests, chi-square...)
Modeling: I want to understand the way a phenomenon evolves according to a set of parameters. (regression, ANOVA, ANCOVA...)

8
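
As a rough illustration of the "Description" category, the sketch below summarizes one small variable with the statistics named on the slide (mean, standard deviation, and the quartiles a boxplot would draw). It uses only Python's standard library rather than XLSTAT, and the invoice amounts are invented for the example.

```python
import statistics

# Hypothetical invoice amounts (made-up values, for illustration only).
invoices = [12.0, 15.5, 9.0, 22.0, 18.5, 30.0, 11.0, 14.0]

mean = statistics.mean(invoices)
stdev = statistics.stdev(invoices)  # sample standard deviation (n - 1 denominator)
# Three cut points splitting the data into four parts: the box of a boxplot.
q1, median, q3 = statistics.quantiles(invoices, n=4)

print(f"mean={mean:.2f} sd={stdev:.2f} Q1={q1:.2f} median={median:.2f} Q3={q3:.2f}")
```

Quartile conventions differ between tools, so the Q1 and Q3 values printed here may not match another package exactly.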
Reminder on Descriptive / exploratory statistics

9
Data set: online shoe selling platform

Variables
Individuals

10
Toward exploratory data analysis: scatter plot colored by group

- Invoice amount decreases with time spent on the website.
- Plutonians spend more money on the website compared to others.
- Martians and humans form a relatively homogeneous group
- ...


11
The same kind of reasoning on a higher number of variables: Exploratory statistics (or Exploratory Data Analysis)

12
Principal Component Analysis
Chart 1: correlation circle; chart 2: observations

[Chart annotations: one side is labeled Weight-, Height-, time on site+; the other side is labeled Weight+, Height+, time on site-]

13
PCA: explorations ...

- Weight increases with height
- Shoe size is unrelated to weight / height
- Time spent on site decreases with weight & height
- Derrick has big feet. Shaun has small feet.
- Looks like there are two clusters in the data
- And so on...

14
Data exploration inspired many hypotheses. Are they valid?
Statistical tests

15
Statistical tests
I want to accept / reject a very precise hypothesis assuming error risks.
Statistical tests usually answer yes/no questions

16
Statistical testing: steps
Writing up the question (answer: yes/no)

Writing up the null & the alternative hypotheses

Choosing the appropriate statistical test & the alpha risk threshold (check out the guide online)

Gathering the data

Things will be added here later

Running the test

Answering the question: if p-value < alpha, we reject H0, accepting a risk of at most alpha of rejecting a true H0

17
Step 1: writing up the question
Question: do fertilizers A & B induce a difference in sugar rate in bananas?

18
Step 2: Writing up the null & the alternative hypotheses
H0 vs Ha

19
Writing up hypotheses

Question
Do fertilizers A & B induce a difference in sugar rate in bananas?

Null Hypothesis (H0)
Generally implies an idea of equality
H0: mean sugar rate in A-fertilized bananas = mean sugar rate in B-fertilized bananas

Alternative Hypothesis (Ha)
Generally implies an idea of difference
Ha: mean sugar rate in A-fertilized bananas ≠ mean sugar rate in B-fertilized bananas

20
Statistical testing: steps (where are we?)
Writing up the question (answer: yes/no)

Writing up the null & the alternative hypotheses

Choosing the appropriate statistical test & the alpha risk threshold (check out the guide online)

Gathering the data

Things will be added here later

Running the test

Answering the question: if p-value < alpha, we reject H0, accepting a risk of at most alpha of rejecting a true H0

21
Step 3a: choosing the appropriate statistical test

Are we comparing means? If yes, how many?
Are we comparing proportions? If yes, how many?
Are we comparing variances? If yes, how many?
Are we testing associations?
...

In our case, we want to compare 2 means

Student's t-test for two independent samples

Link: choosing the appropriate statistical test according to your situation

22
Step 3b: choosing the alpha risk threshold

The alpha risk threshold (0 < alpha < 1) is the p-value threshold below which we decide to reject H0
The more we want to limit the risk of wrongly rejecting H0, the more we should decrease alpha
People often set alpha at 0.05. This is not a reason to do it systematically
(but this is what we'll do in our example)
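
To make the role of alpha concrete, here is a small simulation sketch (not part of the webinar). It repeats a two-group experiment many times in a situation where H0 is true by construction and counts how often the test rejects H0 anyway. The p-value uses a normal approximation to the t distribution, which is an assumption that is only reasonable because each group has 30 values; all names and numbers are illustrative.

```python
import math
import random
from statistics import NormalDist

def mean_and_variance(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def approx_p_value(a, b):
    """Two-sided p-value for a difference in means (pooled t statistic,
    normal approximation: an assumption, acceptable for ~30 values per group)."""
    (ma, va), (mb, vb) = mean_and_variance(a), mean_and_variance(b)
    na, nb = len(a), len(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t = (ma - mb) / math.sqrt(pooled * (1 / na + 1 / nb))
    return 2 * (1 - NormalDist().cdf(abs(t)))

def false_rejection_rate(alpha, n_experiments=2000, seed=1):
    """Share of simulated experiments that reject H0 although H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_experiments):
        # Both groups are drawn from the same distribution: H0 holds.
        a = [rng.gauss(10.0, 2.0) for _ in range(30)]
        b = [rng.gauss(10.0, 2.0) for _ in range(30)]
        if approx_p_value(a, b) < alpha:
            rejections += 1
    return rejections / n_experiments

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha={alpha}: H0 wrongly rejected in {false_rejection_rate(alpha):.1%} of experiments")
```

The printed rates should stay close to the chosen alpha, which is why lowering alpha limits the risk of a wrong rejection.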

23
Step 4: gathering the data

Experiment: 60 banana trees are planted; 30 of them receive fertilizer A, 30 of them receive fertilizer B

24
Step 5: running the test in XLSTAT
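
For readers without XLSTAT, the following is a minimal sketch of what a Student's t-test for two independent samples computes. It is not XLSTAT's implementation: it assumes equal variances, gets the p-value by numerically integrating the t density with Simpson's rule, and runs on randomly generated sugar rates that are not the webinar's data. The function names are hypothetical.

```python
import math
import random
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_two_sided_p(t, df, steps=2000):
    """P(|T| >= |t|), integrating the density on [0, |t|] with Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)

def student_t_test(a, b):
    """Return (t, degrees of freedom, two-sided p-value), equal variances assumed."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df, t_two_sided_p(t, df)

# Made-up sugar rates for 30 + 30 banana trees (NOT the webinar's data).
rng = random.Random(0)
sugar_a = [rng.gauss(20.0, 2.0) for _ in range(30)]
sugar_b = [rng.gauss(22.0, 2.0) for _ in range(30)]

alpha = 0.05
t, df, p_value = student_t_test(sugar_a, sugar_b)
print(f"t={t:.3f} df={df} p-value={p_value:.4g}")
print("reject H0" if p_value < alpha else "cannot reject H0")
```

A statistics package would normally use a closed-form t distribution function instead of numerical integration; the integration is used here only to keep the sketch dependency-free.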

25
Step 6: interpreting the result and answering the question
p-value vs alpha

26
Interpreting the result

Question
Do fertilizers A & B induce a difference in sugar rate in bananas?

Null Hypothesis (H0)
Generally implies an idea of equality
H0: mean sugar rate in A-fertilized bananas = mean sugar rate in B-fertilized bananas

Alternative Hypothesis (Ha)
Generally implies an idea of difference
Ha: mean sugar rate in A-fertilized bananas ≠ mean sugar rate in B-fertilized bananas

The test computes a number called p-value. 0 < p-value < 1

The p-value is the probability, if H0 were true, of observing a difference at least as large as the one in the data. The smaller it is, the harder the data are to reconcile with H0.

Decision: If p-value < alpha, we reject H0 and accept Ha. Alpha is the risk we accept of rejecting H0 when it is actually true.

27
Interpreting the result

Decision: p-value < alpha. We reject H0 & accept Ha: data this extreme would be very unlikely if H0 were true.

Answer: The two means (fertilizer A vs fertilizer B) are significantly different

28
Parametric vs non parametric tests
Power vs Robustness

29
Parametric vs non parametric tests
Differences in the way they work

A statistical test can be either parametric or non parametric

Parametric tests are reliable only under certain conditions that are linked to the distribution of populations. These conditions can be found in our online statistical testing guide.
Non parametric tests do not assume any particular underlying distribution. Most of them are computed from the ranks of the data.
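
The idea that non parametric tests work on ranks can be sketched as follows. The example ranks the pooled data (ties get the average rank) and derives the Mann-Whitney U statistics from the rank sum of the first sample. It stops at the statistic and does not compute a p-value; the function names and data are illustrative.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    first_position = {}
    count = {}
    for position, v in enumerate(sorted(values), start=1):
        first_position.setdefault(v, position)
        count[v] = count.get(v, 0) + 1
    return [first_position[v] + (count[v] - 1) / 2 for v in values]

def mann_whitney_u(a, b):
    """Return (U_a, U_b), computed from the ranks of the pooled samples."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    return u_a, len(a) * len(b) - u_a

# An extreme value changes the mean a lot but leaves the ranks untouched,
# which is one way to see why rank-based tests are robust.
print(average_ranks([1, 2, 3, 4]))
print(average_ranks([1, 2, 3, 1000]))
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
```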

30
So why do we still use parametric tests?
Differences in their usefulness

Non parametric tests: reliable in a larger number of situations than parametric tests: they are more robust.
Parametric tests: more able to reject H0 if it is false, provided applicability conditions are respected: they are more powerful*.
*Statistical power of a test is its ability to lead to a rejection of H0 if H0 is wrong

So, which type should you choose? Here's a proposition:

Choose an appropriate parametric test

Gather the data

Are assumptions for the parametric test met?
Yes: keep the parametric test
No: replace it with a non parametric test, less powerful but more robust

Run the test

31
Tests on independent vs paired samples

32
Tests on independent vs paired samples

Independent samples
Two or more distinct populations
Examples : compare a treated group and a control group; compare
females and males; compare treated and untreated banana trees.

Paired samples
One single population
Examples: measuring the weight of patients before/after a treatment; following up companies or surveyed individuals at different dates; following photosynthetic capacities of the same banana trees at different dates.
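
With paired samples, the test works on the within-individual differences rather than on two separate groups. The sketch below computes the paired Student t statistic for a before/after design; the weights are invented, and the function name is hypothetical.

```python
import math
import statistics

def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for paired measurements.

    The statistic is the mean of the differences divided by its standard error.
    It is undefined (division by zero) when all differences are identical.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [y - x for x, y in zip(before, after)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error, len(diffs) - 1

# Made-up patient weights (kg) before and after a treatment.
weights_before = [70, 80, 90, 60]
weights_after = [68, 79, 87, 58]

t, df = paired_t_statistic(weights_before, weights_after)
print(f"t={t:.3f} df={df}")
```

Because each patient serves as their own control, the large differences between patients do not inflate the standard error, which is the practical reason for choosing a paired test when the design allows it.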

33
Statistical tests: comparison vs association

34
Statistical tests: comparison & association

Comparison tests
Comparing means (Student / ANOVA)
Comparing variances (Fisher / Levene)
Comparing proportions (tests on proportions)

Variables association tests

Test the association between two qualitative variables (chi-square & Fisher's exact test)
Test the association between two quantitative variables (Pearson & Spearman correlation coefficients)
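
For two quantitative variables, the two coefficients named above can be sketched in a few lines: Pearson measures linear association on the raw values, while Spearman is the Pearson coefficient computed on ranks. This is an illustrative standard-library version with made-up data, and it computes the coefficients only, not the associated tests.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    first_position = {}
    count = {}
    for position, v in enumerate(sorted(values), start=1):
        first_position.setdefault(v, position)
        count[v] = count.get(v, 0) + 1
    return [first_position[v] + (count[v] - 1) / 2 for v in values]

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# A monotonic but non-linear relationship (made-up data).
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(f"Pearson={pearson(x, y):.4f} Spearman={spearman(x, y):.4f}")
```

On this monotonic but curved relationship Spearman reaches its maximum while Pearson stays below it, which mirrors the parametric vs non parametric distinction in the list above.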

35
Commonly used statistical tests
Parametric tests and their non parametric equivalents

Question                         Independent / paired samples  Parametric tests                         Non parametric equivalents
Compare 2 means                  Independent                   Student's t-test on independent samples  Mann-Whitney's test
Compare 2 means                  Paired                        Student's t-test on paired samples       Wilcoxon's test
Compare k means                  Independent                   ANOVA                                    Kruskal-Wallis test
Compare k means                  Paired                        Repeated measures ANOVA                  Friedman's test
Compare 2 variances              Independent                   Fisher's test                            -
Compare k variances              Independent                   Levene's test                            -
Association (qualitative var.)   Independent                   Chi2 test                                Fisher's exact test
Association (qualitative var.)   Paired                        -                                        Cochran's Q test
Association (quantitative var.)  Independent                   Pearson correlation                      Spearman correlation

Link: choose an appropriate test according to your situation
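
One way to read the table above is as a lookup: the question and the sample design select a row, and whether the parametric assumptions hold selects the column. The sketch below encodes a subset of the rows this way. It is only an illustration of the selection logic, not the online selection tool, and the dictionary keys and function name are hypothetical.

```python
# (question, sample design) -> (parametric test, non parametric equivalent)
# Only rows that list both a parametric and a non parametric test are included.
TEST_GUIDE = {
    ("compare 2 means", "independent"): ("Student's t-test on independent samples", "Mann-Whitney's test"),
    ("compare 2 means", "paired"): ("Student's t-test on paired samples", "Wilcoxon's test"),
    ("compare k means", "independent"): ("ANOVA", "Kruskal-Wallis test"),
    ("compare k means", "paired"): ("Repeated measures ANOVA", "Friedman's test"),
    ("association (qualitative var.)", "independent"): ("Chi2 test", "Fisher's exact test"),
    ("association (quantitative var.)", "independent"): ("Pearson correlation", "Spearman correlation"),
}

def suggest_test(question, design, assumptions_met):
    """Pick the parametric test if its assumptions hold, else its equivalent."""
    parametric, non_parametric = TEST_GUIDE[(question.lower(), design.lower())]
    return parametric if assumptions_met else non_parametric

print(suggest_test("Compare 2 means", "Independent", assumptions_met=True))
print(suggest_test("Compare 2 means", "Independent", assumptions_met=False))
```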

36
Association tests: Fisher's exact test on two qualitative variables
Investigating the significance of a contingency table (= crosstab)

37
Application: association test (qualitative variables)
EXAMPLE: car garage, customer satisfaction survey

38
Launching the test in XLSTAT
EXAMPLE: car garage, customer satisfaction survey

39
Association test example
EXAMPLE: car garage, customer satisfaction survey

p-value > alpha. We cannot reject H0.

H0: proportions of categories a & b do not change according to categories no-yes-dk

Ha: proportions of categories a & b change according to categories no-yes-dk
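
The survey table in this example has two rows and three columns. As a simpler illustration of how an exact test on a contingency table works, the sketch below handles only the 2x2 case: with the margins fixed, the top-left count follows a hypergeometric distribution, and the two-sided p-value adds up the probabilities of all tables that are no more likely than the observed one. The function name and the counts are made up, and this is not XLSTAT's implementation.

```python
from math import comb

def fisher_exact_2x2(table):
    """Two-sided Fisher's exact test p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    smallest, largest = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so that equally likely tables are not lost
    # to floating-point rounding.
    return sum(prob(x) for x in range(smallest, largest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Made-up counts: rows = garage a / garage b, columns = satisfied yes / no.
alpha = 0.05
p_value = fisher_exact_2x2([[3, 1], [1, 3]])
print(f"p-value={p_value:.4f}")
print("reject H0" if p_value < alpha else "cannot reject H0")
```

For larger tables such as the 2x3 one in the example, the same principle applies but the enumeration runs over all tables sharing the observed margins.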

40
Statistical tests: revisiting the steps, a conclusion
Writing up the question (answer: yes/no)

Writing up the null & the alternative hypotheses

Choosing the appropriate statistical test (comparison / association) & the alpha risk threshold

Gathering the data

Are assumptions for the parametric test met?
Yes: keeping the parametric test
No: replacing it with a non parametric test, less powerful but more robust

Running the test

Answering the question: if p-value < alpha, we reject H0, accepting a risk of at most alpha of rejecting a true H0

41
Statistics: 4 categories

Description: I want to summarize small data sets (1-3 variables) using simple statistics or charts (mean, standard deviation, boxplots...)
Exploration: I want to easily extract information from a large data set without necessarily having a precise question to answer. (PCA, AHC...)
Tests: I want to accept / reject a very precise hypothesis assuming error risks. (t tests, ANOVA, correlation tests, chi-square...)
Modeling: I want to understand the way a phenomenon evolves according to a set of parameters. (regression, ANOVA, ANCOVA...)

42
Future webinars
Nov. 30, 2016: statistical modeling

43
Thanks for attending!
All the tools we saw are available in all XLSTAT solutions (except XLSTAT-Free)

Survey time

44
Appendix: How to interpret p > alpha?

45
Appendix: How to interpret p > alpha?

If p-value < threshold (often 0.05), we reject H0 and accept Ha, accepting a risk of at most the threshold of rejecting a true H0.

If p-value > threshold, there are two possibilities:

- If statistical power* is high (>0.95)
  We accept H0 and reject Ha by taking another risk (beta = 1 - power) of being wrong.
- If statistical power is low (<0.95)
  The risk of being wrong when accepting H0 is too high (power is low)
  The risk of being wrong when rejecting H0 is too high (p-value is high)
  We are unable to take any decision. Game over.

*(statistical power being the ability of an experiment/a test to make you reject H0 when it is false)

46
Statistical power: how to increase it

Statistical power increases with:

The number of measurements
Measurement precision
Effect size
The alfa threshold
The statistical test used
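
Most of the factors listed above can be seen in a simple approximate power formula for comparing two means. The sketch below assumes two groups of equal size, a known common standard deviation, and a normal approximation; these are simplifying assumptions rather than the webinar's method. The last factor (the test used) is not modeled, and the function name and numbers are illustrative.

```python
import math
from statistics import NormalDist

def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided comparison of two means.

    effect: true difference between the means (effect size)
    sd: standard deviation of the measurements (lower = more precise)
    n_per_group: number of measurements in each group
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect / (sd * math.sqrt(2 / n_per_group))
    # Probability that the test statistic falls in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

print(f"baseline:            {approx_power(1.0, 2.0, 30):.3f}")
print(f"more measurements:   {approx_power(1.0, 2.0, 60):.3f}")
print(f"better precision:    {approx_power(1.0, 1.0, 30):.3f}")
print(f"larger effect size:  {approx_power(2.0, 2.0, 30):.3f}")
print(f"higher alpha (0.10): {approx_power(1.0, 2.0, 30, alpha=0.10):.3f}")
```

Each variation raises the power relative to the baseline, and beta (the risk mentioned in the appendix) is simply one minus the value returned.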
