BEAM078 Applied Empirical Accounting and Finance
BEFM022 Quantitative Research Methods
Module Leader: Dr Anthony Wood Email: a.p.wood@exeter.ac.uk
Workshop Tutor: Dr Wanling Rudkin Email: w.rudkin@exeter.ac.uk
Lecture 1
Introduction and Basic Statistical Concepts
Part 1
Introduction

Objectives
• Introduction to various research methodologies and tools
• Statistical methods
• Databases
• STATA software
• Introduction to academic research
• Preparation for dissertation or equivalent
Introduction
Structure
• 15 credits
• 10 x 2-hour lectures
• 10 x Workshops or Help Hours
• Office hours: Fridays 10-12
• Assessment: 100% Assignment
• Re-assessment: 100% resubmission of Assignment (capped at 50%)
Introduction

Assignment
• Topic: Bankruptcy Prediction
• Length: 4000 Words
• Type: Individual Assessment
• Data: Part Provided, Part Self-Collection
• Statistical Analysis: STATA
• Writing Style: Academic
• Referencing Style: APA
• The Assignment Brief can be found in full HERE
Introduction

Expectations
• Attendance
• Completion of tasks
• Ask questions
• Planning and preparation
• Completion of assignment by deadline
• Let me know if you are having difficulties (in-person/email/office hours)
Introduction
• Primary Resource – ELE

• ELE is your #1 destination for all aspects of this module

• Lecture Slides
• Workshop Questions
• Academic Papers
• Instructional Videos
• Data and Code
• Quizzes
• Assignment Details
Introduction
Recommended Texts (not compulsory)
• Wooldridge – Introduction to Econometrics
• Gujarati – Basic Econometrics
• Baum – An Introduction to Modern Econometrics Using Stata
• Academic papers (introduced throughout the course)
Introduction

• STATA will be used extensively throughout this module.

• It is available for free via the software hub
• Your assignment will be conducted in STATA
• A quick start guide can be found here
• Serial numbers etc will be provided to you via the download.
Part 2
Basic Statistical Concepts
What is/are statistics?
• Statistics – the science dealing with the collection, analysis, interpretation, and presentation of numerical data.

• The practice or science of collecting and analysing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample.

• Statistics is not mathematics! It is a science!

Population: the whole; a collection of persons, objects, companies, or items under consideration in a statistical study.

Sample: the part of the population from which information is collected and analysed.
Basic Statistical Concepts
Usually in statistics we do not know the true population parameters.

e.g. The ONS states the average man in England is 5ft 9in (175.3cm) tall and weighs 13.16 stone (83.6kg).

• How do they know this?

• Did they measure every man in England?

By collecting and analysing real data samples we can make our “best guess” as to the true “answer”.

What is an empirical/statistical study?

1) Using observation-based data to gain knowledge, prove a concept, or answer research questions.
2) Capable of being verified or disproved by observation or experiment (e.g. empirical laws).

This module concerns observation/sample-based data and how it can be used within an empirical/statistical study.
Measures of Centre

Arithmetic Mean = $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where n is the size of the sample and $x_i$ is the value of an observation within the sample.

Commonly just termed the mean or average. The arithmetic mean uses the sum of all observations within the
sample. The further an observation is from the mean, the more unusual the observation is. These
unusual/extreme values heavily impact the mean value.

Geometric Mean = $\sqrt[n]{\prod_{i=1}^{n} x_i} = \sqrt[n]{x_1 x_2 x_3 \dots x_n}$, where n is the size of the sample.

The geometric mean is the nth root of the product of all observations within the sample. It is frequently used to average rates of change over time or to compute the growth rate of a variable. Typically negative observations are removed from the calculation.
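As a rough illustration of the definition above, the sketch below computes a geometric mean in STATA. The three data values are made up for this example (they are not module data), and it assumes the built-in ameans command is available in your version of STATA.

```stata
* Hypothetical data: three observations chosen so the answer is easy to check
clear
input x
2
4
8
end

* ameans reports the arithmetic, geometric and harmonic means of x
ameans x

* The geometric mean by hand: the cube root of 2*4*8 = 64
display (2*4*8)^(1/3)
```

When averaging growth rates, the geometric mean is usually applied to the growth factors (1 + rate) rather than to the rates themselves, which also sidesteps the problem of negative observations.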
Measures of Centre

Sample Mode: Value which occurs most frequently within the sample.

e.g. given the data:

1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,5,6,6,6,6,6,6,7,7,7,8,8,8,9,9,9,9

The mode value would be 6

A dataset can have more than one mode (bimodal, trimodal etc), or it might not have any mode (all
observations are unique).
Measures of Centre

Sample Median: Numeric value separating the “higher” half of the data from the “lower” half, i.e. the value midway in the distribution of values (in the middle), with the data sorted in ascending order.

When the sample size n is odd, the median is equal to the $\left(\frac{n+1}{2}\right)$th observation.

e.g. given the data 2,5,7,11,14, the median equals the $\left(\frac{5+1}{2}\right)$th = 3rd observation = 7

When the sample size n is even, the median is equal to the average of the $\left(\frac{n}{2}\right)$th and $\left(\frac{n}{2}+1\right)$th observations.

e.g. given the data 3,9,10,20, the median is equal to the average of the 2nd and 3rd observations = $\frac{9+10}{2}$ = 9.5

Note: Equal numbers of observations lie above and below the median. It is not affected by extreme values.
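The following is a minimal sketch (not part of the original slides) that uses the odd-sized example above to show that the median is unaffected by an extreme value while the mean is. The replacement value 140 is an assumption chosen purely for illustration.

```stata
* The odd-sized example from the slide: 2, 5, 7, 11, 14
clear
input x
2
5
7
11
14
end

* Mean and median of the original data
quietly summarize x, detail
display "mean = " r(mean) "  median = " r(p50)

* Replace the largest observation with a (hypothetical) extreme value
replace x = 140 in 5

* The mean moves a long way; the median stays at the 3rd ordered observation
quietly summarize x, detail
display "mean = " r(mean) "  median = " r(p50)
```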
Measures of Location

2-Quantile (median): Divides the observed data into two halves and gives the “half-way point”.

Quartiles: Dividing the observed values into quarters, or 4 equal parts (Q1, Q2, Q3)

Quintiles: Dividing the observed values into fifths, or 5 equal parts (QU1, QU2, QU3, QU4)

Deciles: Dividing the observed values into tenths, or 10 equal parts (D1, D2,…,D9)

Percentiles: Dividing the observed values of the variable into hundredths, or 100 equal parts (P1, P2,…,P99).

Note that the median is also the 50th percentile, 2nd Quartile, and 5th Decile!!
Measures of Location (Quantiles)
General rule to find the value of the (interpolated) percentile

1. Locate the position of the Yth percentile within the ranked set of observations
2. Determine the value associated with that position

Location of Yth percentile = $L_Y = (n+1)\frac{Y}{100}$

Ordered data (n = 24):

Location  Value      Location  Value
1         -18.11%    13        0.61%
2         -17.13%    14        1.06%
3         -11.65%    15        1.77%
4         -9.97%     16        2.82%
5         -8.95%     17        2.93%
6         -8.61%     18        3.14%
7         -5.10%     19        6.72%
8         -4.75%     20        6.78%
9         -2.09%     21        7.18%
10        -1.93%     22        7.87%
11        -0.87%     23        9.84%
12        -0.64%     24        10.13%

Given the ordered data above, the location of the 25th percentile = $(24+1)\frac{25}{100}$ = 6.25

In other words, the interpolated percentile is a quarter of the way between locations 6 and 7 (L6 and L7).

Using linear interpolation we can determine its value, namely the value at L6 plus the fractional portion of the difference between the values at L6 and L7:

-8.61 + 0.25(-5.10 - (-8.61)) = -7.7325
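The interpolation rule above can be checked in STATA. The sketch below is an illustration rather than part of the original slides: it types in the 24 ordered observations and uses the centile command, which by default locates a percentile at (n+1)Y/100 and interpolates linearly. The variable name ret is an assumption, and the block is meant to be run as a do-file (it uses /// line continuation).

```stata
* Enter the 24 ordered observations from the slide (in percent)
clear
set obs 24
generate double ret = .
local i = 1
foreach v in -18.11 -17.13 -11.65 -9.97 -8.95 -8.61 -5.10 -4.75 ///
             -2.09 -1.93 -0.87 -0.64 0.61 1.06 1.77 2.82 ///
             2.93 3.14 6.72 6.78 7.18 7.87 9.84 10.13 {
    quietly replace ret = `v' in `i'
    local ++i
}

* 25th percentile using the (n+1)Y/100 location with linear interpolation
centile ret, centile(25)
```

Other commands and packages may use slightly different percentile conventions, so small differences between tools are to be expected.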
Measures of Variation

Range: The sample range of a variable is the difference between its maximum
and minimum values in the data set:

Range = Max −Min.

The range can never decrease, but can increase, when additional observations are included in the data.

The sample interquartile range of the variable, denoted IQR, is the difference between the first and
third quartiles of the variable, that is,

IQR = Q3 − Q1.

The IQR gives the range of the middle 50% of values.

Measures of Variation

Five Number Summary and Box Plot

Minimum, maximum and quartiles together provide information on centre and variation
of the variable in a nice compact way.

The five-number summary of the variable consists of the minimum, maximum, and quartiles written in increasing order: Min, Q1, Q2, Q3, Max.

A boxplot is based on the five-number summary and can be used to provide a graphical display of the centre and variation of the observed values of a variable in a data set.

Example: Price of Beef (£/100g), 54 observations.

0.11,0.17,0.11,0.15,0.10,0.11,0.21,0.20,0.14,0.14,0.23,0.25,0.07,0.09,0.10,0.10,0.19,0.11,
0.19,0.17,0.12,0.12,0.12,0.10,0.11,0.13,0.10,0.09,0.11,0.15,0.13,0.10,0.18,0.09,0.07,0.08,
0.06,0.08,0.05,0.07,0.08,0.08,0.07,0.09,0.06,0.07,0.08,0.07,0.07,0.07,0.08,0.06,0.07,0.06
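A five-number summary for the beef price data can be produced along the following lines. This is a sketch under stated assumptions: the 54 values are typed in by hand rather than loaded from the module data file, the variable name Price_of_beef is taken from the box plot code later in these slides, and the block is meant to be run as a do-file.

```stata
* Enter the 54 beef prices (GBP per 100g) listed on the slide
clear
set obs 54
generate double Price_of_beef = .
local i = 1
foreach p in 0.11 0.17 0.11 0.15 0.10 0.11 0.21 0.20 0.14 0.14 0.23 0.25 ///
             0.07 0.09 0.10 0.10 0.19 0.11 0.19 0.17 0.12 0.12 0.12 0.10 ///
             0.11 0.13 0.10 0.09 0.11 0.15 0.13 0.10 0.18 0.09 0.07 0.08 ///
             0.06 0.08 0.05 0.07 0.08 0.08 0.07 0.09 0.06 0.07 0.08 0.07 ///
             0.07 0.07 0.08 0.06 0.07 0.06 {
    quietly replace Price_of_beef = `p' in `i'
    local ++i
}

* Five-number summary (Min, Q1, Q2, Q3, Max) plus the range and IQR
tabstat Price_of_beef, stats(min p25 p50 p75 max range iqr)
```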
Measures of Variation
Box Plot of Price of Beef (£/100g), 54 observations

DATA LINK
STATA code:

graph hbox Price_of_beef, nooutside

Measures of Variation
Box Plot Height and Weight of male and female students

DATA LINK
STATA code:
graph box height, nooutside over(gender)
graph box weight, nooutside over(gender)
Measures of Variation
Sample Variance

Variance is another measure of the spread of numbers within a data set.

To calculate the variance of the data we first square, then sum, the deviation of each observation from the mean.

This is called the sum of squared deviations, which provides a measure of the total deviation from the mean for all the observed values of the variable.

The variance is the average of the squared deviations. For a sample, the variance is calculated as follows:

$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2$
Measures of Variation
Standard Deviation

The sample standard deviation is the most frequently used measure of variability
and is simply the square root of the variance.

For a variable x, the sample standard deviation, denoted by $s_x$ (or, when no confusion arises, simply by s), is:

$s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2}$

As a general rule of thumb, about 95% of the data will lie within the mean ± 2 standard deviations (for roughly bell-shaped data), but more on this later.
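As a worked illustration (not from the original slides), the sample variance and standard deviation can be computed by hand for the small data set 2, 5, 7, 11, 14 used in the median example, assuming the usual n-1 divisor discussed on the next slide:

```latex
\begin{align*}
\bar{x} &= \frac{2+5+7+11+14}{5} = 7.8 \\
\sum_{i=1}^{5}(x_i-\bar{x})^2 &= (-5.8)^2 + (-2.8)^2 + (-0.8)^2 + (3.2)^2 + (6.2)^2 = 90.8 \\
s^2 &= \frac{90.8}{5-1} = 22.7 \\
s &= \sqrt{22.7} \approx 4.76
\end{align*}
```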
Measures of Variation
Why divide by n-1 and not n?

If you ask this question to the internet or ChatGPT there is much discussion and debate.

The simplest answer however is thus:

Dividing by n-1 reduces bias in the standard deviation estimator. Why?

The sample standard deviation estimator (dividing by n-1) is an estimate of the true population standard deviation from
which the sample was drawn. Because the observed values, on average, fall closer to the sample mean than the population
mean, the sample standard deviation will most likely under-estimate the actual population standard deviation. Dividing by
n-1 attempts to correct this.

e.g. Heights.

I want to know the standard deviation of all heights in the UK (the population). If I go about the country recording people's heights, it is unlikely that I will meet and record the very tall or the very short. The sample would therefore have a smaller standard deviation than the true population parameter if I simply divided by n. Dividing by n-1 corrects this.
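The effect of the divisor can also be seen in a small simulation. The sketch below is illustrative only and rests on several assumptions that are not in the slides: the population is standard normal (true variance 1), each sample has 5 observations, and 10,000 samples are drawn. It compares the average of the variance estimates that divide by n with those that divide by n-1.

```stata
* Each row is one independent sample of size 5 from a N(0,1) population
clear
set seed 12345
set obs 10000
forvalues j = 1/5 {
    generate double x`j' = rnormal()
}

* Sample mean and sum of squared deviations for each row
egen double xbar = rowmean(x1-x5)
generate double ss = 0
forvalues j = 1/5 {
    quietly replace ss = ss + (x`j' - xbar)^2
}

* Variance estimates dividing by n and by n-1
generate double var_n   = ss/5
generate double var_nm1 = ss/4

* On average var_n falls short of the true variance (1); var_nm1 does not
summarize var_n var_nm1
```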
Measures of Variation
Skewness is a measure of how symmetrical the data is about the mean.
If the dispersion of data around the mean is symmetrical, then there is no skewness. For a symmetrical distribution with a single peak, the Mean, Median and Mode are identical.

A positive skewness occurs when there are more extreme values in the right-hand tail of the distribution of data.
Typically Mean > Median > Mode

A negative skewness occurs when there are more extreme values in the left-hand tail of the distribution of data.
Typically Mode > Median > Mean
Measures of Variation
Skewness can be calculated using the following formulae:

MS Excel calculates skewness as = $\frac{n}{(n-1)(n-2)}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s}\right)^3$

STATA calculates skewness as = $\frac{\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^3}{\left[\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2\right]^{3/2}}$

There are also alternate calculations!!

See the Excel example spreadsheet HERE, which demonstrates these calculations.
Measures of Variation
Kurtosis

Kurtosis is an indicator of the size or spread of the data within the tails of its distribution.

If the data follows a normal distribution then the Kurtosis should be 3 (mesokurtic).

A positive/high kurtosis (>3) indicates that you have lots of data in the tails of your distribution (thinner/taller peak, heavier tails).

A negative/low kurtosis (<3) indicates that you have small amounts of data in the tails of your distribution (wider/lower peak, lighter/no tails).
Measures of Variation

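Kurtosis can be calculated using the following formulae:

MS Excel calculates kurtosis as = $\frac{n(n+1)}{(n-1)(n-2)(n-3)}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s}\right)^4 - \frac{3(n-1)^2}{(n-2)(n-3)}$

STATA calculates kurtosis as = $\frac{\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^4}{\left[\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2\right]^{2}}$

Note that the Excel formula returns excess kurtosis (0 for a normal distribution), whereas the STATA formula returns raw kurtosis (3 for a normal distribution).

See the Excel example spreadsheet HERE, which demonstrates these calculations.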
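The moment-based STATA formulae for skewness and kurtosis can be verified by hand. The sketch below is illustrative only: the five data values are made up (they are not module data), and the scalar and variable names are assumptions. It builds the second, third and fourth central moments and compares the resulting ratios with what summarize, detail reports.

```stata
* Hypothetical right-skewed data
clear
input x
1
2
3
4
10
end

* Deviations from the mean and their powers
quietly summarize x
scalar xbar = r(mean)
generate double d2 = (x - xbar)^2
generate double d3 = (x - xbar)^3
generate double d4 = (x - xbar)^4

* Central moments: averages of the powered deviations (dividing by n)
quietly summarize d2
scalar m2 = r(mean)
quietly summarize d3
scalar m3 = r(mean)
quietly summarize d4
scalar m4 = r(mean)

display "skewness by hand = " m3/m2^1.5
display "kurtosis by hand = " m4/m2^2

* Compare with the values reported by STATA
quietly summarize x, detail
display "skewness (STATA) = " r(skewness)
display "kurtosis (STATA) = " r(kurtosis)
```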
Summary Statistics

An important part of any empirical research paper is providing key information (summary statistics) for the data used within the study.

This allows the reader to instantly visualise the data and be satisfied that there are no errors or extreme values that may impact the results of any analysis (more on this later).

Example taken from:

Horton, Tsipouridou, & Wood (2017). European Market Reaction to Audit Reforms, European Accounting Review 27(5).
Summary Statistics

Summary Statistics in STATA

DATA LINK

Simple statistics code:

summarize weight height

Summary Statistics

Summary Statistics in STATA

DATA LINK

More detailed statistics:

summarize weight, detail

Summary Statistics
Summary Statistics in STATA
DATA LINK

Statistics can also be generated via tabstat which gives you more control over
the output than the “summarize” command. For example:

tabstat height, by(gender) stats(n mean sd min p25 p50 p75 max)
Distribution
Distributions are a fundamental aspect of any statistical or
empirical analysis.

This part of the lecture will look at:

Discrete probability distributions

• The distribution of a discrete (categorical) variable, where variable x has a countable number of possible values.

Lecture 2 will focus on:

Continuous probability distributions

• The distribution of a continuous variable x that has a set of possible values
which is infinite and uncountable.
Discrete Probability Distributions

Imagine a hypothetical experiment consisting of a very long sequence of repeated observations on some random phenomenon.

e.g. flipping a coin or rolling a dice.

Each observation may or may not result in some particular outcome.

The probability of that outcome is defined to be the relative frequency of its occurrence, in the long run.

The probability of a particular outcome is the proportion of times that outcome would occur in a long run of repeated observations.
Discrete Probability Distributions
The probability distribution of a discrete random variable x assigns a
probability to each possible value of the variable.

Variable x has a countable number of possible values.

The probability distribution of x lists the values and their probabilities.

The probabilities $P(x_i)$ must satisfy two requirements, namely:

$0 \le P(x_i) \le 1$ for every $x_i$, and $\sum_i P(x_i) = 1$

(probabilities are bounded between 0 and 1, and the sum of all probabilities is equal to 1)
Discrete Probability Distributions
Example: Dice rolling simulation.

Imagine that I have 2 dice, each with 6 sides numbered 1 through 6.
If I roll both dice and add the two numbers, I will obtain an integer between 2 and 12 (let's call this S).

We can calculate the probability of each possible outcome S as follows (there are 36 equally likely combinations in total):

S = 7: 6 ways, 6/36
S = 6 or 8: 5 ways each, 5/36
S = 5 or 9: 4 ways each, 4/36
S = 4 or 10: 3 ways each, 3/36
S = 3 or 11: 2 ways each, 2/36
S = 2 or 12: 1 way each, 1/36

This is the theoretical distribution: what I would expect to see should I roll the dice enough times.
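The theoretical distribution of S can be reproduced by listing all 36 combinations of the two dice. The following is a minimal sketch; the variable names die1, die2 and S are assumptions made for this illustration.

```stata
* One row for each of the 36 equally likely (die1, die2) combinations
clear
set obs 36
generate die1 = ceil(_n/6)
generate die2 = mod(_n - 1, 6) + 1

* S is the sum of the two dice
generate S = die1 + die2

* Freq. gives the number of ways; Percent/100 gives the probability
tabulate S
```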
Discrete Probability Distributions
Example: Dice rolling simulation.
Let’s see what happens when I roll the dice 10 times.

(Figure: relative frequency of each outcome S after 10 rolls)

NOTE: The actual frequency of rolling a 7 = 3. The relative frequency is therefore 3/10 = 0.30.

Because of the relatively small sample size (n=10), the relative frequencies obtained do not closely match the theoretical probability distribution seen on the previous slide, so let's increase the sample size.
Discrete Probability Distributions
Example: Dice rolling simulation.
(Figures: relative frequency of each outcome S for increasing sample sizes)

As the sample size increases, the closer we get to the theoretical discrete probability distribution.
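A simulation along these lines can be sketched in STATA as follows. The seed, the total of 10,000 rolls, and the intermediate sample sizes of 10 and 100 are assumptions chosen for illustration, and the code assumes a STATA version that provides runiformint() (STATA 14 or later).

```stata
* Simulate 10,000 rolls of two dice and record the sum S
clear
set seed 2024
set obs 10000
generate S = runiformint(1, 6) + runiformint(1, 6)

* Relative frequencies (Percent column) for increasing sample sizes
tabulate S in 1/10
tabulate S in 1/100
tabulate S
```

With more rolls, the Percent column should settle towards the theoretical values (for example roughly 16.67% for S = 7).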
Histograms
Histograms plot the frequency or counts of discrete variables.
(as seen on the previous slide)

A histogram shows an approximate representation of the distribution of that data.

Let us plot a histogram in STATA using the ages of 102 people.

The data can be found here

and in raw format below

34,67,40,72,37,33,42,62,49,32,52,40,31,19,68,55,57,54,37,32,54,38,20,50,56,48,35,52,29,56,68,65,45,44,
54,39,29,56,43,42,22,30,26,20,48,29,34,27,40,28,45,21,42,38,29,26,62,35,28,24,44,46,39,29,27,40,22,38,
42,39,26,48,39,25,34,56,31,60,32,24,51,69,28,27,38,56,36,25,46,50,36,58,39,57,55,42,49,38,49,36,48,44
Histograms
Firstly we can tabulate the data in order to view how many people are of a particular age.

Data Link
STATA Code:

tabulate age
Histograms
We can also easily plot the histogram using the histogram command

Data Link
STATA Code:

histogram age

Each bar represents a range (bin) of ages. Here STATA automatically divides the data into 10 bins.
Histograms
We can change the number of bins if we wish.

STATA Code: histogram age, bin(20)

STATA Code: histogram age, discrete
(because age is a discrete variable, we can use the option “discrete”, which uses all possible integer values within the data range)
Histograms
We can also use histograms to visualise the distributions of continuous variables.

Continuous variables can take any value within a range and can be recorded to any number of decimal places.

i.e. the data does not conform to a set of pre-determined possible values like a discrete variable does.

In order to plot continuous data as a histogram, the data must be placed within a bin.

Similarly to discrete variables, you can tell STATA the number of bins which you require,
and the frequency counts are calculated for you.

Let us look at some share price return data which can be downloaded here

The data consists of 10,000 artificially generated stock returns.

Histograms

STATA Code: histogram stock_return, bin(10)
STATA Code: histogram stock_return, bin(40)
STATA Code: histogram stock_return, bin(100)
STATA Code: histogram stock_return, bin(1000)
STATA Code: histogram stock_return, bin(100) kdensity
Continuous Probability Distributions

The dice example demonstrated a discrete probability distribution with 11 possible outcomes (integers from 2 to 12).

We have also seen that a continuous variable can be divided into bins and we can visualise its probability distribution at various bin sizes.

If the number of bins or class intervals increases, and given enough data, the shape of the histogram will approach a smooth curve.

In the limit, as the number of class intervals (bins) tends to infinity, the discrete probability distribution becomes a continuous probability distribution.

This will be the focus of Lecture 2

Tasks

• Watch my video on how to get started with STATA.

• Use the examples contained within these slides to familiarise yourself with generating summary statistics and histograms.

• Complete the “Test Your Knowledge” Mini Quizzes

• Complete Workshop_01 Questions for next week.

Building Block

Assignment Structure
