R Practical Ecotrix

The document outlines the R programming procedures for data handling, model estimation, and diagnostic testing in ecotrix analysis. It details the use of various libraries and functions for tasks such as importing data, performing descriptive statistics, fitting linear and dynamic models, and testing for assumptions like normality, heteroscedasticity, and multicollinearity. A complete flow of a typical analysis is also provided, summarizing the steps from data import to model selection.

Uploaded by

phtsruth
Setup & Data Handling


library(package_name)
• Purpose: Loads an installed external package so that its functions can be called.
• Packages used:
o DescTools (summary stats, residual diagnostics)
o dynlm (dynamic linear models for time series)
o ecm (error correction models; Durbin's h test)
o lmtest (model tests: RESET, PE, Breusch-Pagan, Breusch-Godfrey)
o readxl (import Excel files)
o tseries (Jarque-Bera test, etc.)
o wooldridge (access Wooldridge datasets)
o nlme and car (needed later for gls() and vif())

data("dataset_name")
• Purpose: Loads a dataset that ships with a package, e.g., wage1 from wooldridge.
read_excel("file_path") (from readxl)
• Purpose: Imports data from an Excel file into R.
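A minimal sketch of the setup step. The requireNamespace() guard and the file name my_data.xlsx are illustrative assumptions, not part of the original procedure.

```r
# Packages named in this section
pkgs <- c("DescTools", "dynlm", "ecm", "lmtest", "readxl", "tseries", "wooldridge")

# requireNamespace() returns FALSE instead of stopping when a package is absent
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
not_installed <- pkgs[!installed]
if (length(not_installed) > 0) {
  message("Install first: ", paste(not_installed, collapse = ", "))
}

# Load a packaged dataset only if its package is available
if (installed[["wooldridge"]]) {
  data("wage1", package = "wooldridge")
  str(wage1[, c("wage", "educ", "exper")])
}

# read_excel() needs a real file; "my_data.xlsx" is a hypothetical path
# my_data <- readxl::read_excel("my_data.xlsx")
```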

📊 Descriptive Statistics & Plots


Desc(variable) (DescTools)
• Purpose: Provides comprehensive descriptive statistics.
• Use: Initial data exploration to understand the distribution, outliers, etc.
hist(variable)
• Purpose: Plots a histogram of a variable.
• Use: Visual check of the distribution, including approximate normality.
scatter.smooth(x, y)
• Purpose: Plots a scatterplot with a LOESS smooth curve drawn through it.
• Use: Visual diagnostic for linearity.
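As a rough illustration of the exploration step, the sketch below uses the built-in mtcars data (an assumption made so that it runs without extra packages). Desc() from DescTools would replace summary() in the original workflow.

```r
# Built-in dataset, used purely for illustration
x <- mtcars$wt    # car weight
y <- mtcars$mpg   # miles per gallon

summary(y)                                              # quick numeric summary
h <- hist(y, main = "mpg", xlab = "Miles per gallon")   # shape of the distribution
scatter.smooth(x, y, xlab = "wt", ylab = "mpg")         # scatter plus LOESS curve
```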

📈 Model Estimation
lm(y ~ x1 + x2, data = dataset)
• Purpose: Fits a linear regression model by OLS.
• Use: Core command for cross-sectional or time-series OLS models.
• Assumptions to check afterwards: Linearity, homoscedasticity, no autocorrelation, no multicollinearity, normality of residuals.
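A small sketch of an OLS fit on simulated data. The data-generating values (1, 2 and -0.5) are made up for illustration.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)   # assumed true model
dataset <- data.frame(y, x1, x2)

model <- lm(y ~ x1 + x2, data = dataset)
summary(model)   # coefficients, standard errors, R-squared, F-statistic
coef(model)      # estimates should be close to 1, 2 and -0.5
```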
dynlm(y ~ L(y, 1) + x, data = dataset) (dynlm)
• Purpose: Fits dynamic linear models including lags; L(y, 1) is the first lag of y.
• Use: Specifically for time-series models with lagged terms.
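A hedged sketch of a dynamic model with one lag of the dependent variable. The simulated series and the start date are assumptions, and the dynlm package must be installed.

```r
library(dynlm)

set.seed(2)
n <- 120
x <- rnorm(n)
y <- numeric(n)
for (i in 2:n) y[i] <- 0.5 * y[i - 1] + x[i] + rnorm(1)   # y depends on its own lag

# Monthly multivariate series; the start date is arbitrary
dat <- ts(cbind(y = y, x = x), start = c(2000, 1), frequency = 12)

fit <- dynlm(y ~ L(y, 1) + x, data = dat)
summary(fit)   # one observation is lost to the lag
```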
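gls(y ~ x1 + x2, data = dataset) (from nlme)
• Purpose: Generalized Least Squares estimation.
• Use: Remedy for autocorrelation or heteroscedasticity. The error structure has to be supplied through the correlation argument (e.g., corAR1()) or the weights argument (e.g., varPower()); without either, the errors are still treated as independent with constant variance.
• Assumption: Adjusts for a non-spherical error structure, provided that structure is specified correctly.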
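One way this might look for AR(1) errors, assuming simulated data and the corAR1() structure from nlme. The choice of AR(1) is an assumption for illustration.

```r
library(nlme)

set.seed(3)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
e  <- as.numeric(arima.sim(list(ar = 0.6), n = n))   # AR(1) errors
dataset <- data.frame(y = 1 + 2 * x1 - x2 + e, x1, x2)

# corAR1() with its default form uses the row order as the time order
fit_gls <- gls(y ~ x1 + x2, data = dataset, correlation = corAR1())
summary(fit_gls)   # also reports the estimated AR(1) parameter
```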
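lag(variable, -1)
• Purpose: Creates lagged variables manually (-1 gives the first lag, -2 the second, etc.).
• Use: Needed in time series for dynamic models or Durbin’s h test. stats::lag() only shifts the time index of a ts object and leaves the values unchanged, so align the series with cbind() or ts.intersect() before passing them to lm(), or use dynlm().
ts(variable, start = c(year, period), end = c(year, period), frequency = 12)
• Purpose: Creates a time-series object.
• Use: Required for time-series modeling; monthly data use frequency = 12, quarterly data use frequency = 4.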
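The following sketch shows how a first lag lines up with the original series. The six values and the start date are hypothetical.

```r
# Hypothetical monthly series starting January 2020
y <- ts(c(10, 12, 15, 14, 18, 21), start = c(2020, 1), frequency = 12)

y_lag1  <- stats::lag(y, -1)         # same values, time index moved one month later
aligned <- ts.intersect(y, y_lag1)   # keeps only the dates where both series exist
aligned

# Regress y on its first lag using the aligned columns
fit <- lm(y ~ y_lag1, data = as.data.frame(aligned))
coef(fit)
```

Calling stats::lag() explicitly avoids picking up a different lag() from another attached package.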

🔍 Residuals and Normality Tests


resid(model)
• Purpose: Extracts the residuals from a fitted regression model.
jarque.bera.test(variable) (tseries)
• Purpose: Tests for normality using skewness and kurtosis (null: normal distribution).
• Use: For raw variables or residuals, e.g., jarque.bera.test(resid(model)).
• Assumption checked: Normality of residuals.
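To make the test less of a black box, here is a base-R sketch of the Jarque-Bera statistic computed from residuals of a simulated regression. The data are an assumption; jarque.bera.test() from tseries should report the same statistic.

```r
set.seed(4)
x <- rnorm(100)
d <- data.frame(x = x, y = 1 + 2 * x + rnorm(100))
model <- lm(y ~ x, data = d)

u  <- resid(model)
n  <- length(u)
m2 <- mean((u - mean(u))^2)                 # second central moment
skew <- mean((u - mean(u))^3) / m2^1.5      # sample skewness
kurt <- mean((u - mean(u))^4) / m2^2        # sample kurtosis

jb      <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)
p_value <- pchisq(jb, df = 2, lower.tail = FALSE)   # chi-squared with 2 df
c(JB = jb, p.value = p_value)
```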

🔍 Model Specification Tests


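resettest(model, power = 2:3, type = "fitted", data = dataset) (lmtest)
• Purpose: Ramsey RESET test for functional form misspecification.
• Use: Adds powers of the fitted values (here squares and cubes) to the model and tests their joint significance; rejection points to omitted nonlinear terms or a wrong functional form.
• Assumption checked: Correct specification of the model.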
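A short sketch of the RESET test catching a missing quadratic term. The simulated model is an assumption, and lmtest must be installed.

```r
library(lmtest)

set.seed(5)
n <- 200
x <- runif(n, 0, 5)
dataset <- data.frame(x = x, y = 1 + x + 2 * x^2 + rnorm(n))   # truly quadratic

model <- lm(y ~ x, data = dataset)   # deliberately omits x^2
rt <- resettest(model, power = 2:3, type = "fitted", data = dataset)
rt   # a small p-value rejects the linear specification
```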
petest(model1, model2) (lmtest)
• Purpose: Compares linear vs. log-linear models using the PE test.
• Use: Tests which form of the dependent variable (y or log(y)) fits better.
• Assumption checked: Correct model specification.

🧪 Heteroscedasticity Tests
bptest(model, studentize = FALSE) (lmtest)
• Purpose: Breusch-Pagan test for heteroscedasticity.
• Use: Checks constant variance of the errors (null: homoscedastic). studentize = FALSE gives the original test; the default TRUE gives Koenker's studentized version.
• Assumption checked: Homoscedasticity.
bptest(model, ~ x1*x2 + I(x1^2) + I(x2^2), data = dataset) (White’s Test)
• Purpose: White’s test for heteroscedasticity (general form).
• Use: More flexible than the BP test; the auxiliary regression includes interaction and squared terms.
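Both tests can be tried on data whose error spread grows with x1. This is a sketch on simulated data (an assumption), with lmtest installed.

```r
library(lmtest)

set.seed(6)
n  <- 300
x1 <- runif(n, 1, 10)
x2 <- runif(n, 1, 10)
# Error standard deviation grows with x1, so the errors are heteroscedastic
dataset <- data.frame(x1, x2, y = 1 + 2 * x1 + x2 + rnorm(n, sd = x1))

model <- lm(y ~ x1 + x2, data = dataset)
bp    <- bptest(model, studentize = FALSE)                              # Breusch-Pagan
white <- bptest(model, ~ x1 * x2 + I(x1^2) + I(x2^2), data = dataset)   # White form
bp
white
```

The White form has five auxiliary regressors (x1, x2, their interaction and the two squares), which is where its degrees of freedom come from.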

🛠 Heteroscedasticity Remedies
lm(y ~ x1 + x2, data = dataset, weights = 1/x1)
• Purpose: Weighted Least Squares (WLS).
• Use: Remedy for heteroscedasticity when the error variance is proportional to an explanatory variable (e.g., weights = 1/educ in a wage equation).
• Assumption: Known form of heteroscedasticity.
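To see what the weights do, the sketch below checks that WLS with weights 1/x matches OLS on data divided by sqrt(x). It assumes simulated data in which the error variance equals x.

```r
set.seed(7)
n <- 200
x <- runif(n, 1, 10)
y <- 1 + 2 * x + rnorm(n, sd = sqrt(x))   # Var(u) = x
d <- data.frame(x, y)

wls <- lm(y ~ x, data = d, weights = 1 / x)

# Same model after dividing every term by sqrt(x); no separate intercept
ols_t <- lm(I(y / sqrt(x)) ~ 0 + I(1 / sqrt(x)) + I(sqrt(x)), data = d)

cbind(WLS = coef(wls), transformed_OLS = coef(ols_t))
```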

🔄 Autocorrelation Tests and Remedies


durbinH(model, variable) (ecm)
• Purpose: Durbin’s h test for autocorrelation in the presence of a lagged dependent variable.
• Use: Time-series models with lagged dependent terms; variable identifies the lagged dependent variable in the model.
• Assumption checked: No autocorrelation.
bgtest(model, order = 2) or bgtest(model, order = 3) (lmtest)
• Purpose: Breusch-Godfrey test for higher-order autocorrelation.
• Use: More flexible than Durbin-Watson; allows lagged dependent variables and tests serial correlation up to the given order.
• Assumption checked: No serial correlation.
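A sketch of the Breusch-Godfrey test on a regression with AR(1) errors. The simulated data are an assumption, and lmtest must be installed.

```r
library(lmtest)

set.seed(8)
n <- 200
x <- rnorm(n)
e <- as.numeric(arima.sim(list(ar = 0.7), n = n))   # serially correlated errors
dataset <- data.frame(x = x, y = 1 + 2 * x + e)

model <- lm(y ~ x, data = dataset)
bg <- bgtest(model, order = 2)   # null: no serial correlation up to order 2
bg
```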

🔍 Multicollinearity Detection
cor(dataset)
• Purpose: Correlation matrix.
• Use: Preliminary check for multicollinearity; all columns must be numeric.
• Interpretation: High correlation between explanatory variables may suggest multicollinearity.
vif(model) (car package recommended)
• Purpose: Variance Inflation Factor (VIF).
• Use: Diagnoses multicollinearity (rule of thumb: VIF > 10 is problematic).
• Assumption checked: No multicollinearity.
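The VIF of a regressor is 1 / (1 - R²) from regressing it on the other regressors. The base-R sketch below computes it by hand on simulated data (an assumption), which for a plain linear model should agree with what vif() reports.

```r
set.seed(9)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)   # nearly a copy of x1
x3 <- rnorm(n)                  # unrelated regressor
X  <- data.frame(x1, x2, x3)

round(cor(X), 2)   # preliminary look at pairwise correlations

# Regress each regressor on the others and convert R-squared to a VIF
vif_manual <- sapply(names(X), function(v) {
  aux <- lm(reformulate(setdiff(names(X), v), response = v), data = X)
  1 / (1 - summary(aux)$r.squared)
})
vif_manual
```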

✅ Model Comparison & Validation


anova(model1, model2)
• Purpose: Compares nested models.
• Use: F-test of the restricted model against the unrestricted one to choose between them; also usable for comparing WLS fits.
plot(model)
• Purpose: Diagnostic plots of residuals, fitted values, etc.
• Use: Check linearity, heteroscedasticity, influential points.
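A brief sketch of a nested-model comparison on simulated data. The coefficients are made up for illustration.

```r
set.seed(10)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
dataset <- data.frame(x1, x2, y = 1 + 2 * x1 + 0.8 * x2 + rnorm(n))

model1 <- lm(y ~ x1, data = dataset)        # restricted model
model2 <- lm(y ~ x1 + x2, data = dataset)   # unrestricted model

cmp <- anova(model1, model2)   # F-test of the restriction that x2 has no effect
cmp
```

A small p-value in the second row favours keeping x2 in the model.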

✅ Complete Flow of a Typical Analysis


1. Data Import & Description: read_excel(), Desc(), hist(), scatter.smooth()
2. Model Estimation: lm(), dynlm(), gls()
3. Residual Analysis: resid(), hist(), jarque.bera.test()
4. Heteroscedasticity: bptest(), White’s Test, WLS
5. Autocorrelation: durbinH(), bgtest(), gls()
6. Multicollinearity: cor(), vif()
7. Specification Errors: resettest(), petest()
8. Model Selection: anova(), plot()
