CH 01 Introduction TQT

ECONOMETRICS
The Nature of Econometrics and Economic Data
Module Code: INS1064
Number of credits: 4

Pre-requisite(s): Theory of probability and mathematical statistics (MAT1004)

Teaching Language: English

Lecturer Information:

No. | Name | Title | Institution | Email | Phone
1. | Trần Quang Tuyến | Ph.D. | VNU-IS | tuyenisvnu@gmail.com | 0912474896
2. | Lê Văn Đạo | Master | VNU-IS | daoleisvnu@gmail.com | 0394952064
Assessment methods

No. | Assessment items | Value | Notes
1. | Regular assessment | 20% | In-class and take-home exercises/assignments: good presenting and writing.
   | - Attendance and learning documents | 10% |
   | - In-class and take-home exercises/assignments | 10% |
2. | Midterm exam | 20% | One-hour written open-book exam: multiple choice; interpretation.
3. | Final exam | 60% | The test consists of the group's project and an oral examination lasting 15 minutes. Each student receives a single grade based on their entire written assignment and individual oral presentation.
Total | | 100% |
Required textbook:
1. Jeffrey M. Wooldridge, Introductory Econometrics: A Modern
Approach, 5th edition, Cengage Learning, 2016
References:
2. Damodar N. Gujarati, Dawn C. Porter, Basic Econometrics, 5th edition, McGraw-Hill, 2009.
CONTENTS
Chapter 1. The nature and methodology of econometrics

Chapter 2. Simple linear regression (SLR)
Chapter 3. Multiple linear regression (MLR)

Chapter 4. Multiple linear regression model: Inference & asymptotics

Chapter 5. Further issues with multiple linear regression

Chapter 6: Regression models with dummy (binary) variables

Chapter 7: More on specification and data issues

Midterm exam: One-hour written open-book exam

Chapter 8: Basic regression analysis with time series data

Chapter 9: Further issues on using ordinary least squares method with time series
data

Chapter 10: Serial correlation and heteroskedasticity in time series regressions

Chapter 11: Carrying out an empirical project


Chapter 1: Nature and methodology of econometrics
1.1 The definition and purposes of econometrics

1.2. Methodology of econometrics

1.3. The significance of the error term

1.4. Types of economic data

1.5. Causality and the notion of ceteris paribus in econometric analysis
Case study: Statistical relationship vs. deterministic relationship
1.1. Definition and purposes of econometrics


Econometrics can be defined as the social science that applies economic
theory, mathematics, and statistics to quantify economic phenomena.

Economic theory offers statements or hypotheses that are mostly qualitative in nature.

Mathematical economics focuses on expressing economic relationships in the form of mathematical equations, regardless of measurements or empirical verification.

Economic statistics is mainly concerned with collecting, analyzing, and presenting economic data (tables, figures, charts, etc.).


Common goals of econometric analysis

Test economic theories and hypotheses.


Investigate relationships between economic
variables.
Forecast economic phenomena.
Assess government and business policies.
Testing an economic theory: human capital theory
Testing an economic theory: supply and demand
Estimating the relationships between socio-economic variables
Forecasting economic phenomena
Evaluating policies implemented by the government or firms
1.2. Methodology of Econometrics

Generally speaking, traditional econometric methodology proceeds along the following steps:

Step 1. Formulating research questions/hypotheses

Step 2. Specification of a suitable economic model

Step 3. Turning the economic model into an econometric model

Step 4. Obtaining the data

Step 5. Estimating the parameters of the econometric model

Step 6. Testing the hypotheses

Step 7. Forecasting/policy implications


Step 1: Formulating research questions/hypotheses
Examples:
Does job mismatch affect wage and job turnover?
Does greater household wealth make young children perform better?
Do mobile banner advertisements increase sales?
Does an increase in cigarette tax reduce cigarette consumption?

Step 2: Specifying a suitable economic model


This step is often skipped in empirical research.

The model may be a micro or a macro model.

Such models are often based on optimizing behaviour or equilibrium.

Some models establish relationships between economic variables, e.g., FDI and technology transfer, or CSR and firm performance.


Step 2: Specifying an economic model
Example 1:

The functional form was not specified (e.g., linear or non-linear)


The equation was proposed without a formal economic model
Step 2: Specifying an economic model
Example 2:
Step 2. Economic model

What criteria are used to choose variables for the model?
1. Economic theory
2. Previous empirical studies
3. Intuition
How do I select relevant variables?
Tran (2014)
Step 3: Turning the economic model into an econometric model
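As an illustration of this step (a generic textbook-style example, not taken from the slides), suppose the economic model only says that the wage depends on education and experience. Assuming a linear functional form and adding an error term $u$ for everything that is not observed gives one possible econometric model. The variable names and the linear form are assumptions made for the sketch:

```latex
\begin{align*}
\text{Economic model:}    \quad & wage = f(educ,\ exper) \\
\text{Econometric model:} \quad & wage = \beta_0 + \beta_1\, educ + \beta_2\, exper + u
\end{align*}
```

The economic model leaves the function $f$ unspecified; the econometric model commits to a functional form, to parameters $\beta_0, \beta_1, \beta_2$ that can be estimated from data, and to an error term.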
Step 4: Obtaining the data
❑ Primary data are information that has been collected directly by the researcher.

❑ Secondary data are information that already exists and has been gathered by other people or groups.
❑ Examples of secondary data: the Vietnam Household Living Standards Survey (VHLSS), the Labour Force Survey, and the Enterprise Census, which are conducted by the General Statistics Office (GSO).

❑ Other data are available from the WB (World Bank), the ILO (International Labour Organization), and the WTO (World Trade Organization).


Step 5. Estimating the parameters of the econometric model
❑ We employ various econometric techniques to estimate the population parameters.
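One such technique is ordinary least squares (OLS), which the later chapters cover. The sketch below is only an illustration: the population model, the sample size, and the helper name `ols_simple` are assumptions, and the data are simulated so that the estimates can be compared with the known population parameters.

```python
import random


def ols_simple(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


random.seed(0)
# Assumed population model (made up for illustration): y = 2 + 0.5 * x + u
x = [random.uniform(0, 20) for _ in range(1000)]
y = [2.0 + 0.5 * xi + random.gauss(0, 1) for xi in x]

b0_hat, b1_hat = ols_simple(x, y)
print(f"intercept estimate: {b0_hat:.3f}, slope estimate: {b1_hat:.3f}")
```

With real data the population parameters are unknown, so the estimates cannot be checked against them in this way; that is why hypothesis testing is needed in the next step.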
Step 6. Hypothesis testing

❑ We have to test hypotheses about the population parameters ($\beta_j$).

❑ Assuming the fitted model is a good approximation of reality, we must construct criteria to determine whether the estimates from our econometric analysis match the theory's expectations.
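A common criterion of this kind is the t statistic for a single parameter. The sketch below tests the null hypothesis that the slope is zero in a simple regression. The simulated model and sample size are assumptions, and 1.96 is the usual large-sample critical value for a two-sided test at the 5% level.

```python
import math
import random

random.seed(1)
# Assumed population model (illustrative only): y = 1 + 0.5 * x + u
n = 200
x = [random.uniform(0, 20) for _ in range(n)]
y = [1.0 + 0.5 * xi + random.gauss(0, 2) for xi in x]

# OLS estimates of the slope and intercept
mean_x = sum(x) / n
mean_y = sum(y) / n
s_xx = sum((xi - mean_x) ** 2 for xi in x)
b1_hat = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / s_xx
b0_hat = mean_y - b1_hat * mean_x

# Residual variance with n - 2 degrees of freedom (two estimated parameters)
residuals = [yi - b0_hat - b1_hat * xi for xi, yi in zip(x, y)]
sigma2_hat = sum(r ** 2 for r in residuals) / (n - 2)

# Standard error of the slope and t statistic for H0: beta_1 = 0
se_b1 = math.sqrt(sigma2_hat / s_xx)
t_stat = b1_hat / se_b1

reject_h0 = abs(t_stat) > 1.96
print(f"b1_hat={b1_hat:.3f}, se={se_b1:.3f}, t={t_stat:.2f}, reject H0: {reject_h0}")
```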
Step 7. Forecasting and policy implications
❑ The econometric results can be used to predict the future value(s) of the dependent (forecast) variable Y based on the known or expected future values of the explanatory variables (Xs).

❑ Some empirical findings offer useful information for policymakers, which can support better policy adjustment or intervention.
1.3. The significance of the error term

Why does the error/disturbance term always exist?


❑ Ambiguity of theory: the list of factors that can affect Y is always incomplete.
❑ Unavailability of data: even when we know that we have omitted some important factors that affect Y, we may not have data on them.
❑ Poor proxy variables: the error term also captures errors of measurement.
❑ Wrong functional form: even if we have selected the correct and relevant variables (Xs) for our model, very often we do not know the form of the functional relationship between Y and X.
❑ Principle of parsimony: it is better to keep the regression model as simple as possible. A few X variables that explain a significant portion of Y may be better than a model that includes many other variables without a strong theoretical basis.
1.4. Types of economic data
Four types of economic data sets

Cross-sectional data
Time series data
Pooled cross sections
Panel/Longitudinal data
Note: The choice of econometric method depends on the type/nature of the data used.

Applying an inappropriate method may produce misleading results.


Table 1.1: Cross-sectional data set on households in Hoai Duc District
[Table not reproduced. Annotations on the table: observation number (the 5th household); age of household head = 54; consumption per capita = 1106.67 thousand VND/month; indicator variable (1 = poor; 0 = non-poor).]


Table 1.2: Cross-sectional data on countries' GDP and education
Cross-sectional data sets
Random samples of individuals, families, enterprises, cities, regions, nations, or other units of interest, taken at a given point in time or in a given period.
Cross-sectional observations are more or less independent.
Pure random sampling is likely to be violated, e.g., when respondents refuse to respond or when cluster sampling is used.
Cross-sectional data are mostly used in applied microeconomics.
Table 1.3: Time series data set on trade and tourism in Vietnam (billion VND; "." = not available)
Year  Total  Retail sales  Accommodation and food services  Services and tourism
1990 19031.2 16747.4 2283.8 .
1991 33403.6 29183.3 4220.3 .
1992 51214.5 44778.3 6436.2 .
1993 67273.3 58424.4 8848.9 .
1994 93490 74091 11656 7743
1995 121160 94863 16957 9340
1996 145874 117547 18950 9377
1997 161899.7 131770.4 20523.5 9605.8
1998 185598.1 153780.6 21587.7 10229.8
1999 200923.7 166989 21672.1 12262.6
2000 220410.6 183864.7 23506.2 13039.7
2001 245315 200011 30535 14769
2002 280884 221569.7 35783.8 23530.5
2003 333809.3 262832.6 39382.3 31594.4
2004 398524.5 314618 45654.4 38252.1
2005 480293.5 373879.4 58429.3 47984.8
2006 596207.1 463144.1 71314.9 61748.1
2007 746159.4 574814.4 90101.1 81243.9
2008 1007213.5 781957.1 113983.2 111273.2
2009 1405864.6 1116477 158847.9 130540.1
2010 1677344.7 1254200 212065.2 211079.5
2011 2079523.5 1535600 260325.9 283597.6
2012 2369130.6 1740360 305651 323119.9
2013 2615203.6 1964667 315873.2 334663.9
2014 2916233.9 2189448 353306.5 373479
2015 3223202.6 2403723 399841.8 419637.6
2016 3546268.6 2648857 439892.3 457519.6
2017 3956599.1 2967485 488615.6 500498.8
2018 4393525.5 3308059 534168.5 551298
2019 4892114.39 3694560 595936.91 601617.59
2020 4847645.3 3815079 479715.67 552850.58
2021 4657066.28 3830560 379390.64 447115.82
Time series data
Observations on a single variable or on multiple variables over time.
Examples: GDP, inflation, stock prices, annual exchange rates, agricultural sales, …
Such data are mostly serially correlated (observations are often not independent over time), which requires more advanced econometric techniques.
The order of the observations contains important information.
Frequency: daily, weekly, monthly, quarterly, annually, …
Typical characteristics: trends and seasonality.
Typical applications: applied macroeconomics and finance.
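Serial correlation can be checked informally by correlating a series with its own one-period lag. The sketch below does this for the last ten years (2012-2021) of the first data column (total) of Table 1.3. The helper name `correlation` is an assumption, and this is a descriptive check only, not a formal test.

```python
# Total column of Table 1.3 for 2012-2021 (billion VND)
total = [2369130.6, 2615203.6, 2916233.9, 3223202.6, 3546268.6,
         3956599.1, 4393525.5, 4892114.39, 4847645.3, 4657066.28]


def correlation(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    s_ab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    s_aa = sum((ai - mean_a) ** 2 for ai in a)
    s_bb = sum((bi - mean_b) ** 2 for bi in b)
    return s_ab / (s_aa * s_bb) ** 0.5


lagged = total[:-1]   # y_{t-1}
current = total[1:]   # y_t
rho_1 = correlation(lagged, current)
print(f"lag-1 correlation: {rho_1:.3f}")
```

A strongly positive value is what one expects from a trending series: each year's value is close to the previous year's, so the observations are far from independent.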
Pooled cross sections
A combination of more than one cross-sectional data set in one data set.

Cross sections are sampled independently of each other.

Such data are often used for assessing policy changes.

Example:

• Measure the effect of Hanoi's expansion on house prices

• Random sample of house prices for the year 2007

• A new random sample of house prices for the year 2009

• Compare before/after (2007: before expansion, 2009: after expansion)
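The before/after comparison in this example can be sketched with simulated data. Everything numeric below is hypothetical (prices, units, sample sizes); the point is only the mechanics of pooling two independent samples and tagging each observation with a year dummy.

```python
import random

random.seed(2)
# Hypothetical house prices (million VND per square metre); values are simulated.
sample_2007 = [random.gauss(20, 4) for _ in range(1000)]  # before expansion
sample_2009 = [random.gauss(25, 4) for _ in range(1000)]  # after expansion

# Pool the two independent cross sections, tagging each row with a year dummy
# (0 = sampled in 2007, 1 = sampled in 2009).
pooled = [(price, 0) for price in sample_2007] + [(price, 1) for price in sample_2009]


def mean_price(rows, after):
    """Average price among rows whose year dummy equals `after`."""
    prices = [price for price, flag in rows if flag == after]
    return sum(prices) / len(prices)


diff = mean_price(pooled, 1) - mean_price(pooled, 0)
print(f"after minus before: {diff:.2f}")
```

The difference in means equals the effect of the expansion only if nothing else that affects house prices changed between the two years, which is a strong assumption in practice.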
Table 1.4: Pooled cross sections on housing prices (Wooldridge, 2014)
[Table not reproduced; its two blocks of rows are labelled "Before reform" and "After reform".]
Table 1.5: Two-year panel data on provincial development statistics
Panel or longitudinal data
The same cross-sectional units are followed over time.

Such data have both a cross-sectional and a time series dimension.

Panel data enable researchers to eliminate time-invariant unobservables.

Panel data can be used for models with lagged dependent variables.

Example: Factors affecting provinces' economic growth

• Each province is observed in two or more years

• Time-invariant unobserved province characteristics (that may affect economic growth) can be modeled and removed

• The effect of government policy on growth may exhibit a time lag


Panel data and unobservable time-invariant factors
$u$: the error term, which represents all unobservable factors. Here it is split into two parts, $e_{it}$ and $a_i$:

$English_{it} = \beta_0 + \beta_1 GDP_{it} + a_i + e_{it}$

$e_{it}$ represents unobservable factors that affect English scores and change over time.
$a_i$ represents the unobservable factors that affect English scores in province $i$ but do not change over time (especially over a short period).
E.g., $a_i$ may represent social or historical traditions about studying and learning.
Omitting $a_i$ may cause omitted variable bias, but we do not have data on it.
The key idea is that any change in English scores from 2021 to 2023 cannot be caused by $a_i$, because $a_i$ does not change during this period.
We have the regressions for 2023 and 2021:
$English_{i,2023} = \beta_0 + \beta_1 GDP_{i,2023} + a_i + e_{i,2023}$
$English_{i,2021} = \beta_0 + \beta_1 GDP_{i,2021} + a_i + e_{i,2021}$
Then we take the difference:
$English_{i,2023} - English_{i,2021} = \beta_1 (GDP_{i,2023} - GDP_{i,2021}) + (e_{i,2023} - e_{i,2021})$
$a_i$ is removed by differencing the two equations.
Panel data allow for the elimination of unobservable time-invariant factors.
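The effect of differencing can be seen in a small simulation. All numbers below are hypothetical, and the helper `ols_slope` is an assumption. The province effect `a` is built to be correlated with GDP, so a regression on a single year is biased, while the regression on two-year differences is not.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx


random.seed(3)
n = 2000        # hypothetical number of provinces (large, to reduce noise)
beta_1 = 0.3    # assumed true effect of GDP on the English score

# a_i: time-invariant province effect, unobserved by the researcher
a = [4 * random.gauss(0, 1) for _ in range(n)]
# GDP is correlated with a_i, which is what creates omitted variable bias
gdp_2021 = [50 + 1.25 * ai + random.gauss(0, 5) for ai in a]
gdp_2023 = [g + random.uniform(0, 10) for g in gdp_2021]

eng_2021 = [40 + beta_1 * g + ai + random.gauss(0, 1) for g, ai in zip(gdp_2021, a)]
eng_2023 = [40 + beta_1 * g + ai + random.gauss(0, 1) for g, ai in zip(gdp_2023, a)]

# Single-year regression: a_i is left in the error term
slope_cross_section = ols_slope(gdp_2023, eng_2023)

# First-difference regression: a_i cancels out
d_gdp = [g23 - g21 for g23, g21 in zip(gdp_2023, gdp_2021)]
d_eng = [e23 - e21 for e23, e21 in zip(eng_2023, eng_2021)]
slope_first_diff = ols_slope(d_gdp, d_eng)

print(f"cross-section slope:    {slope_cross_section:.3f}")
print(f"first-difference slope: {slope_first_diff:.3f}  (true value {beta_1})")
```

The sketch relies on the change in GDP being unrelated to the change in the time-varying error; differencing does not help with unobserved factors that themselves change over time.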
1.5. Causality and the notion of ceteris paribus
Ceteris paribus is a Latin phrase expressing the assumption that other (relevant) factors are equal or held constant.

The notion of the causal effect of X on Y:

"How does Y change if X changes while all other factors are held constant?"

❑ Most economic questions are ceteris paribus questions.

In analyzing consumer behaviour (microeconomics), we have the law of demand:

"All other factors held constant, the higher the unit price of a good, the fewer the number of units demanded by consumers and, consequently, sold by firms" (Samuelson and Marks, 2009).
❑ The goal of econometric analysis is to infer that one variable (like education) causes
another (such as worker productivity). If other factors are not held fixed, then we
cannot know the causal effect of education on productivity.

❑ A careful application of econometric methods can simulate a ceteris paribus experiment.


Non-experimental data and econometrics
Econometrics, which typically analyzes nonexperimental data, has developed into a separate discipline from mathematical statistics.
Non-experimental data (observational or retrospective data) | Experimental data
Data is collected in a real-life setting. | Data is collected in a controlled environment and manipulated by the researcher.
It is often impossible for researchers to control the conditions or variables of interest. | The researcher can create environments and carefully control the conditions and variables of interest.
Researchers are passive collectors of data. | Researchers are active collectors of data.
Low or even no cost, especially when using secondary data. | Costly and sometimes unethical.
It is more difficult to determine a causal relationship. | It is possible to determine a causal relationship.
Mostly in social sciences. | Mostly in natural sciences.


Experiments | Natural experiments
The 2019 Nobel Prize in Economic Sciences | The 2021 Nobel Prize in Economic Sciences
Abhijit Banerjee, Esther Duflo, and Michael Kremer | David Card, Joshua Angrist, and Guido Imbens
The experimental approach to alleviating global poverty. | New insights about the labour market, and showing what conclusions about cause and effect can be drawn from natural experiments.
Randomized controlled trials (RCTs) are experimental studies that apply an intervention to a random subset of the target population so that the effects of the intervention may be compared to those of a control group. | Natural experiments are studies in which the units of analysis are exposed to as-good-as-random variation caused by nature, institutions, or policy changes.
The researcher can create environments and carefully control the conditions and variables of interest. | Researchers do not create natural experiments; rather, they find them.
 | Natural experiments are observational studies, not true experiments; the researcher has little (if any) control over the social conditions of the studies.
How can a controlled experiment be constructed to infer the causal effect?
Causal effect of fertilizer on crop yield
"By how much will rice output increase if one increases the amount of fertilizer applied to the field?"
It must be assumed that all other factors that affect rice yield, such as land quality, temperature, rainfall, crop diseases, etc., are held fixed.
Experiment:
Select several one-acre plots of land, randomly assign different amounts of fertilizer to the different plots, and then compare the output.
In this case, the experiment works because the amount of fertilizer applied is unrelated to the other factors that affect rice yields.
In other words, the experiment isolates the effect of fertilizer from the other factors that affect rice yields.
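The logic of random assignment can be illustrated with a simulation. The yield equation, the coefficient values, and the helper `ols_slope` are all assumptions made for the sketch. Land quality affects yield and is never used in the regression, yet the slope on fertilizer is still close to its assumed true value because fertilizer was assigned at random.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx


random.seed(4)
n = 2000              # hypothetical number of one-acre plots
true_effect = 0.8     # assumed yield gain per unit of fertilizer

land_quality = [random.gauss(0, 1) for _ in range(n)]    # not observed by the researcher
fertilizer = [random.uniform(0, 10) for _ in range(n)]   # randomly assigned amounts

rice_yield = [3 + true_effect * f + 2 * q + random.gauss(0, 1)
              for f, q in zip(fertilizer, land_quality)]

# Regress yield on fertilizer only; land quality is left in the error term
slope = ols_slope(fertilizer, rice_yield)
print(f"estimated effect of fertilizer: {slope:.3f} (assumed true effect {true_effect})")
```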
Experiment and ethical issue
Causal effect of education on productivity
In order to estimate the causal effect of education on labour productivity, all other factors that influence wages, such as experience, innate ability, family background, etc., must be held fixed.

Problem without random assignment: nonexperimental or observational data often suffer from self-selection or endogeneity.

E.g., education level is likely to be related to unobservables, such as innate ability. People with higher abilities, for example, tend to have higher levels of education.
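The self-selection problem can be sketched with simulated observational data. The wage equation (written in logs), the coefficient values, and the helper `ols_slope` are assumptions for illustration. Because people with higher ability choose more education, a regression of the wage on education alone attributes part of the effect of ability to education.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx


random.seed(5)
n = 5000
true_return = 0.08   # assumed causal effect of one more year of education on log wage

ability = [random.gauss(0, 1) for _ in range(n)]            # unobserved
# Self-selection: higher ability leads to more years of education
educ = [12 + 2 * ab + random.gauss(0, 1) for ab in ability]
log_wage = [1 + true_return * e + 0.10 * ab + random.gauss(0, 0.2)
            for e, ab in zip(educ, ability)]

# Observational regression of log wage on education, ability omitted
slope_obs = ols_slope(educ, log_wage)
print(f"observational estimate: {slope_obs:.3f} (assumed true return {true_return})")
```

In this setup the observational estimate is larger than the assumed true return, because education partly stands in for ability.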

An experiment can make sure that education is unrelated to other factors that affect wages. E.g., choose a group of children, randomly assign different levels of education to them, and finally compare the wage outcomes.

Is this experiment unethical?

With non-experimental data, discovering causality is very challenging.

But it is often infeasible to conduct an experiment because of ethical issues.


Econometrics: statistical relationships
Statistical relationship | Deterministic relationship
Variables are random or stochastic and have probability distributions. | Variables are not random or stochastic.
There are errors in measurement. | There are no errors in measurement.
The relationship between economic variables is generally inexact. | There is an exact relationship between variables.
E.g., the rate of return to education is found to be 12.6% in Thailand but 10.3% in China. | E.g., Newton's law of gravity: $F = G \frac{M_1 M_2}{r^2}$, with $G = 6.67 \times 10^{-11}\ \mathrm{N\,m^2/kg^2}$
Exercises 1
1. What are the differences between economic and econometric models?
2. What is your comment about this statement? "An econometric model is always derived from a formal
economic model."
3. Why does observational data often not guarantee the assumption "ceteris paribus"?
4. In a study on the effect of fertilizer on rice productivity, more fertilizer is used on less fertile plots, but we do not have data on land fertility. If we found a positive link between fertilizer and rice yields, could we convincingly conclude that fertilizer makes rice production more productive?
5. Say you have to do research on whether violent video games cause school violence among students.
Is it feasible? Why?
6. Can you infer a causal effect of violent video games on school violence if your research is based on observational data? Explain why.
7. Name factors other than violent video games that can affect school violence. Name some that are measurable and some that are unmeasurable.
8. Each group comes up with a hypothesis or a research question and then specifies an economic model
and the variables that are relevant to it.
A. Please explain why these variables were chosen.
B. How can data for these variables be measured or collected?
C. What is the expected direction of the effect of each variable?
9. Please specify variables for an economic model of corruption in Vietnam.
A. Please explain why these variables were chosen.
B. How can data for these variables be measured or collected?
C. What is the expected direction of the effect of each variable?
