
Introduction to Statsmodels

Module 1: Introduction to Statsmodels

Lecture 1.1: Introduction to Statsmodels


Overview of Statsmodels and Its Importance
Statsmodels is a powerful Python library designed for estimating and testing
statistical models. It plays a crucial role in statistical analysis by offering tools
that extend beyond the capabilities of other libraries like NumPy or SciPy. With
Statsmodels, users can:

Estimate a wide variety of statistical models, including linear regression, generalized linear models, time series models, and more.

Perform statistical tests such as hypothesis testing and confidence interval estimation.

Explore and visualize data to gain insights before modeling.

The importance of Statsmodels lies in its ability to handle complex statistical analyses with a user-friendly interface. It allows both beginners and experienced statisticians to specify models, fit them to data, and interpret results with ease. This makes it an essential tool for data scientists, researchers, and analysts working with statistical data in Python.
Brief History and Development
Statsmodels originated as the statistical models code written by Jonathan Taylor for the SciPy library. Around 2009 it was separated into its own package and substantially extended, and it has been developed independently ever since. Today, Statsmodels is maintained by a dedicated team of developers and is widely used in the scientific community for its robust statistical capabilities.
Comparison with Other Statistical Libraries in Python
While Python offers several libraries for statistical analysis, Statsmodels stands
out for its focus on statistical modeling and hypothesis testing. Below is a
comparison with other popular libraries:

NumPy and SciPy: These libraries provide basic statistical functions and tools for numerical computing. However, Statsmodels offers more advanced tools for estimating and interpreting statistical models.

Patsy: Patsy is primarily used for describing statistical models and building
design matrices, but it lacks the modeling and testing capabilities of
Statsmodels.

Seaborn: Seaborn is focused on data visualization, particularly for statistical data, but it does not provide tools for model estimation or hypothesis testing.

R’s Statistical Packages: Statsmodels is often compared to R’s statistical capabilities, as it brings similar functionality to Python, making it a great alternative for those who prefer Python’s syntax and ecosystem.

In summary, Statsmodels fills a critical gap in Python’s statistical analysis capabilities by providing a comprehensive suite of tools for statistical modeling, testing, and data exploration.

Lecture 1.2: Basic Concepts and Terminology
Before diving into the specifics of Statsmodels, it is essential to understand the
fundamental concepts and terminology used in statistical modeling. This
lecture covers the basics of statistical models, key concepts such as
parameters and residuals, and common statistical terms.

Understanding Statistical Models and Their Types


A statistical model is a mathematical representation of the relationship
between variables. It is used to describe, explain, or predict phenomena based
on data. Different types of models are suited to different kinds of data and
research questions. Some common types include:

Linear Regression: Models the linear relationship between a dependent variable and one or more independent variables. It is used when the response variable is continuous.

• Example: Predicting house prices based on features like size and location.

Logistic Regression: Used for binary classification problems where the dependent variable is categorical (e.g., yes/no, 0/1).

• Example: Predicting whether a customer will buy a product based on their demographics.

Time Series Models: Designed for data collected over time, such as stock
prices or weather data. These models account for temporal dependencies.

• Example: Forecasting future sales based on historical data.


Each model type comes with its own assumptions and is selected based on the
nature of the data and the specific research question.

Key Concepts

Parameters: These are the coefficients in a statistical model that define the
relationship between the variables. For example, in a linear regression
model y = \beta_0 + \beta_1 x + \epsilon , \beta_0 (intercept) and \beta_1
(slope) are parameters.

Estimates: These are the values of the parameters calculated from the
data. They are used to make predictions or inferences about the population
from which the data was drawn.

Residuals: Residuals are the differences between the observed values and
the values predicted by the model. They are crucial for assessing the fit of
the model. A good model will have residuals that are randomly distributed
with no clear pattern.

Hypothesis Testing: A method used to test whether there is enough evidence to reject a null hypothesis (e.g., whether a parameter is significantly different from zero). It involves calculating a test statistic and comparing it to a critical value or using a p-value.

Common Statistical Terminology

P-value: The probability of observing a test statistic as extreme as the one calculated, assuming the null hypothesis is true. A small p-value (typically < 0.05) suggests that the null hypothesis can be rejected.

Confidence Interval: A range of values within which the true parameter is likely to lie, with a specified level of confidence (e.g., 95%). It provides a measure of the precision of the estimate.

Standard Error: A measure of the variability of an estimate. It indicates how much the estimate would vary if the experiment were repeated multiple times.

Understanding these concepts is critical for interpreting the results of statistical analyses and for using Statsmodels effectively.

Example: Linear Regression with Statsmodels

To illustrate these concepts, let’s consider a simple linear regression example using Statsmodels.

Code Snippet

import statsmodels.api as sm
import numpy as np

# Generate some sample data
np.random.seed(0)
X = np.random.rand(100, 1)  # Independent variable
y = 2 + 3 * X + np.random.randn(100, 1)  # Dependent variable with noise

# Add a constant to the independent variable (for the intercept)
X = sm.add_constant(X)

# Fit the linear regression model
model = sm.OLS(y, X).fit()

# Print the summary of the model
print(model.summary())

Explanation

In this example:

We generate sample data where the true relationship is y = 2 + 3x + \epsilon, with \epsilon representing random noise.

We use Statsmodels’ OLS (Ordinary Least Squares) function to fit a linear regression model to the data.

The summary() function provides a detailed output, including:

• Estimates of the parameters (intercept and slope).

• Standard errors of the estimates.

• P-values for testing whether each parameter is significantly different from zero.

• R-squared, which measures the goodness of fit.

This example demonstrates how Statsmodels can be used to estimate a model,
obtain parameter estimates, and perform hypothesis tests—all essential steps
in statistical analysis.
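The same quantities can also be pulled from the fitted results object programmatically rather than read off the printed summary. A minimal sketch, continuing from the model fitted in the snippet above:

# Continuing from the fitted `model` above
print(model.params)                 # parameter estimates (intercept and slope)
print(model.bse)                    # standard errors of the estimates
print(model.pvalues)                # p-values for each coefficient
print(model.conf_int(alpha=0.05))   # 95% confidence intervals
print(model.rsquared)               # R-squared (goodness of fit)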

Exercises for Reinforcement


To solidify your understanding of the concepts covered in this module, try the
following exercises:

1. What is the difference between a parameter and an estimate in a statistical model?

• Hint: Think about the true value versus the value calculated from data.

2. Explain what residuals are and why they are important.


• Hint: Consider how residuals help assess model fit.

3. Using the linear regression example provided, interpret the p-value for the
slope coefficient.

• Hint: What does a small p-value indicate about the slope?


Summary

This module has provided a comprehensive introduction to Statsmodels and the basic concepts necessary for statistical modeling. In Lecture 1.1, we explored what Statsmodels is, its importance, and how it compares to other statistical
what Statsmodels is, its importance, and how it compares to other statistical
libraries in Python. In Lecture 1.2, we covered fundamental statistical concepts
such as parameters, estimates, residuals, and hypothesis testing, along with
common terminology like p-values and confidence intervals. The linear
regression example demonstrated how these concepts are applied using
Statsmodels.

Module 2: Data Preparation and Exploration

Introduction
Data preparation and exploration are foundational steps in statistical analysis
and modeling. Preparing data involves loading it from various sources, cleaning
it by addressing missing values and inconsistencies, and transforming it to suit
analytical needs. Exploratory Data Analysis (EDA) allows us to summarize and
visualize the data, revealing its patterns, distributions, and potential issues like outliers. These steps ensure the data is reliable and well-understood before
applying statistical models, such as those in Statsmodels.
In this module, we’ll explore techniques for loading and manipulating data,
followed by methods for conducting EDA, using Python libraries like Pandas,
Statsmodels, Matplotlib, and Seaborn. We’ll use the Iris dataset from
Statsmodels for consistent examples.

Lecture 2.1: Loading and Manipulating Data
This lecture focuses on getting data into a usable format and preparing it for
analysis.

Importing Data from Various Sources


Statsmodels integrates smoothly with Pandas DataFrames, making it easy to
import data from different formats:

CSV Files: Load data from a CSV file using pandas.read_csv().

import pandas as pd
data = pd.read_csv('path/to/your/file.csv')

• Excel Files: Use pandas.read_excel() for Excel files.

data = pd.read_excel('path/to/your/file.xlsx')

Pandas DataFrames: If data is already in a DataFrame, it can be used directly with Statsmodels.

For our examples, we’ll use the iris dataset from Statsmodels:

import statsmodels.api as sm
iris = sm.datasets.get_rdataset('iris').data
# The R dataset uses columns like 'Sepal.Length'; rename them to the names used below
iris.columns = ['sepal length', 'sepal width', 'petal length', 'petal width', 'species']

The Iris dataset includes sepal and petal measurements for three iris flower
species, providing a rich dataset for demonstration.

Handling Missing Data and Data Cleaning Techniques


Real-world data often has imperfections that must be addressed:

Identifying Missing Values: Check for missing data with isnull().

missing_values = iris.isnull().sum()
print(missing_values)

• Dropping Missing Values: Remove rows with missing data if they’re minimal.

iris_clean = iris.dropna()

• Imputing Missing Values: Replace missing numerical values with the mean,
median, or mode.

iris['sepal length'] = iris['sepal length'].fillna(iris['sepal length'].mean())

• Removing Duplicates: Eliminate duplicate rows to ensure data integrity.

iris = iris.drop_duplicates()

• Correcting Data Types: Ensure columns have appropriate types, e.g., categorical data.

iris['species'] = iris['species'].astype('category')

These steps create a clean dataset ready for further manipulation.

Data Transformation and Feature Engineering


Transforming data can improve model performance by meeting assumptions or
enhancing features:

Transforming Variables: Use functions like logarithm or square root to adjust distributions.

import numpy as np
iris['log_sepal_length'] = np.log(iris['sepal length'])

• Creating Interaction Terms: Combine variables to capture combined effects.

iris['sepal_petal_interaction'] = iris['sepal length'] * iris['petal length']

• Generating Polynomial Features: Add polynomial terms for non-linear
relationships.

from sklearn.preprocessing import PolynomialFeatures


poly = PolynomialFeatures(degree=2, include_bias=False)
poly_features = poly.fit_transform(iris[['sepal length', 'sepal width']])

These techniques prepare the data for more accurate statistical modeling.

Lecture 2.2: Exploratory Data Analysis (EDA)
EDA helps us understand the data’s structure and characteristics before
modeling.

Summary Statistics
Summary statistics provide insights into data’s central tendencies and spread:

Using describe() in Pandas: Get a quick overview of numerical columns.

summary = iris.describe()
print(summary)

This outputs count, mean, standard deviation, min, max, and quartiles.

Specific Statistics: Calculate individual measures as needed.

mean_sepal_length = iris['sepal length'].mean()


median_petal_width = iris['petal width'].median()

Data Visualization

Visualization is key to EDA, and while Statsmodels offers some plotting, Matplotlib and Seaborn provide greater flexibility:

Histograms for Distribution: Show the spread of a variable.

import matplotlib.pyplot as plt


import seaborn as sns
sns.histplot(iris['sepal length'], kde=True)

plt.title('Distribution of Sepal Length')
plt.show()

• Scatter Plots for Relationships: Examine how variables interact.

sns.scatterplot(x='sepal length', y='petal length', data=iris, hue='species')


plt.title('Sepal Length vs Petal Length')
plt.show()

• Box Plots for Outliers: Highlight outliers and compare groups.

sns.boxplot(x='species', y='sepal width', data=iris)


plt.title('Sepal Width by Species')
plt.show()

These plots reveal distributions, relationships, and anomalies visually.

Understanding Data Distributions and Relationships


Understanding the data’s properties guides model selection:

Checking for Normality: Test if data follows a normal distribution, often assumed in models like linear regression.

from scipy.stats import shapiro


stat, p = shapiro(iris['sepal length'])
print('Shapiro-Wilk Test: Statistics=%.3f, p=%.3f' % (stat, p))

A p-value > 0.05 means the test does not reject normality.

Exploring Correlations: Measure relationships between numerical variables.

correlation_matrix = iris.corr(numeric_only=True)  # correlations among numeric columns only
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm')
plt.title('Correlation Matrix')
plt.show()

Identifying Outliers and Anomalies


Outliers can distort analysis and must be detected:

Visual Methods: Use plots to spot unusual values.

sns.boxplot(x=iris['sepal width'])
plt.title('Box Plot of Sepal Width')
plt.show()

• Statistical Methods: Apply the Interquartile Range (IQR) method.

Q1 = iris['sepal width'].quantile(0.25)
Q3 = iris['sepal width'].quantile(0.75)
IQR = Q3 - Q1
outliers = iris[(iris['sepal width'] < (Q1 - 1.5 * IQR)) |
                (iris['sepal width'] > (Q3 + 1.5 * IQR))]
print(outliers)

Decide whether to remove or adjust outliers based on their impact and context.

Summary
This module covered critical steps in data preparation and exploration. In
Lecture 2.1, we learned to load data from CSV, Excel, and DataFrames, clean it
by handling missing values and duplicates, and transform it through feature
engineering. In Lecture 2.2, we explored EDA with summary statistics,
visualizations, distribution analysis, and outlier detection. Using the Iris dataset,
we demonstrated these concepts with practical Python code, leveraging
Statsmodels, Pandas, Matplotlib, and Seaborn.

Module 3: Linear Regression Models

Introduction
Linear regression is a foundational statistical technique used to model the
relationship between a dependent variable and one or more independent
variables. It is widely applied in prediction, hypothesis testing, and
understanding variable relationships. This module explores two key types of
linear regression—Simple Linear Regression and Multiple Linear Regression—
using the statsmodels library in Python.

We will use the Boston Housing dataset, which includes variables such as
median home value (medv), crime rate (crim), and average number of rooms
(rm), to illustrate the concepts.

Lecture 3.1: Simple Linear Regression


Introduction to Simple Linear Regression Using OLS
Simple Linear Regression models the relationship between one independent
variable and one dependent variable with a linear equation. It assumes a
straight-line relationship between the variables.
The model is expressed as ( y = \beta_0 + \beta_1 x + \epsilon ), where:

( y ): Dependent variable

( x ): Independent variable

( \beta_0 ): Intercept (value of ( y ) when ( x = 0 ))

( \beta_1 ): Slope (change in ( y ) per unit change in ( x ))

( \epsilon ): Error term

Ordinary Least Squares (OLS) estimates ( \beta_0 ) and ( \beta_1 ) by minimizing the sum of squared residuals (differences between observed and predicted values).
Model Specification, Estimation, and Interpretation
Specifying the Model

In statsmodels, we specify the model using a formula syntax. For example, to model medv as a function of crim, the formula is 'medv ~ crim'.

Estimating the Model


Here’s how to fit the model using the Boston Housing dataset:

import statsmodels.api as sm

import statsmodels.formula.api as smf

# Load the dataset

boston = sm.datasets.get_rdataset('Boston', 'MASS').data

# Specify and fit the model

model = smf.ols('medv ~ crim', data=boston).fit()

# View results

print(model.summary())

Interpreting the Results


The output includes:

Intercept ( \beta_0 ): Predicted medv when crim is 0.

Slope ( \beta_1 ): Change in medv for a one-unit increase in crim. A negative value suggests higher crime rates reduce home values.

P-value: Tests if the coefficient is significantly different from zero (typically, p < 0.05 indicates significance).

R-squared: Proportion of variance in medv explained by crim (0 to 1; higher is better).

Understanding Coefficients, R-squared, and Residual Plots

Coefficients: Quantify the relationship between variables. For example, a slope of -0.42 for crim means medv decreases by 0.42 units per unit increase in crim.

R-squared: Measures model fit. An R-squared of 0.15 means 15% of the variability in medv is explained by crim.

Residual Plots

Residuals (observed minus predicted values) help validate model assumptions:

Linearity: Residuals should scatter randomly around zero.

Homoscedasticity: Residual variance should be consistent across predicted values.

Here’s how to create a residual plot:

import matplotlib.pyplot as plt

# Fitted (predicted) values and residuals from the fitted model
predictions = model.fittedvalues
residuals = model.resid

# Plot
plt.scatter(predictions, residuals)

plt.axhline(0, color='red', linestyle='--')


plt.xlabel('Predicted Values')
plt.ylabel('Residuals')
plt.title('Residual Plot')

plt.show()

A random scatter supports the model’s assumptions; patterns suggest issues.

Lecture 3.2: Multiple Linear Regression


Extending Simple Linear Regression to Multiple Linear Regression
Multiple Linear Regression models the relationship between a dependent variable and multiple independent variables. The equation is ( y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \epsilon ), where ( x_1, x_2, \dots, x_k ) are independent variables.
Handling Categorical Variables and Interaction Terms
Categorical Variables
Categorical variables are included as dummy variables. In statsmodels, use C()
in the formula. For example, chas (1 if tract bounds the Charles River, 0
otherwise) is included as C(chas).
Interaction Terms
Interaction terms model how the effect of one variable depends on another. For
example, crim:chas tests if the effect of crim on medv varies by chas.
Example Model
Let’s model medv with crim, rm, and chas:

# Specify and fit the model

multi_model = smf.ols('medv ~ crim + rm + C(chas)', data=boston).fit()

# View results

print(multi_model.summary())

Coefficients: Interpret each holding other variables constant.

C(chas)[T.1]: Effect of chas = 1 vs. chas = 0.
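To illustrate the interaction syntax mentioned above, the crim-by-chas interaction could be added to the formula as follows (a sketch reusing the boston DataFrame loaded earlier; ':' adds only the interaction term):

# Sketch: main effects plus a crim-by-chas interaction
interaction_model = smf.ols('medv ~ crim + rm + C(chas) + crim:C(chas)', data=boston).fit()
print(interaction_model.summary())

A significant crim:C(chas)[T.1] coefficient would suggest that the effect of crime rate on home values differs for tracts bordering the Charles River.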

Model Diagnostics and Validation Techniques


Multicollinearity

Multicollinearity (high correlation between independent variables) can distort coefficients. Check it with Variance Inflation Factors (VIF):
from statsmodels.stats.outliers_influence import variance_inflation_factor
import pandas as pd

# Prepare data
X = boston[['crim', 'rm', 'chas']]
X = sm.add_constant(X)

# Calculate VIF
vif = pd.DataFrame()
vif['Variable'] = X.columns
vif['VIF'] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]

print(vif)

A VIF > 10 indicates potential multicollinearity; consider removing variables.


Cross-Validation
Cross-validation tests model performance on unseen data. Using scikit-learn:

from sklearn.model_selection import cross_val_score


from sklearn.linear_model import LinearRegression

# Data

X = boston[['crim', 'rm', 'chas']]
y = boston['medv']

# Model and 5-fold cross-validation


lr = LinearRegression()
mse = cross_val_score(lr, X, y, cv=5, scoring='neg_mean_squared_error')

print('Mean MSE:', -mse.mean())

A lower MSE indicates better predictive performance.

Summary
This module covered:

Simple Linear Regression: Using OLS in statsmodels to model a single predictor, interpret coefficients and R-squared, and check residuals.

Multiple Linear Regression: Extending to multiple predictors, handling categorical variables and interactions, and validating with diagnostics like VIF and cross-validation.

Module 4: Statistical Inference and Hypothesis Testing

Introduction
Statistical inference enables us to draw conclusions about a population from
sample data. In this module, we explore hypothesis testing and confidence
intervals to evaluate relationships in linear regression, and model comparison
techniques to select the best model. We’ll use the Boston Housing dataset
(available in statsmodels), which includes variables like median home value
(medv), crime rate (crim), and average number of rooms (rm), to illustrate these
concepts.

Lecture 4.1: Hypothesis Testing and Confidence Intervals

Understanding Hypothesis Testing and Confidence Intervals in the Context
of Linear Regression
In linear regression, hypothesis testing assesses whether an independent variable significantly affects the dependent variable. For each coefficient ( \beta_i ):

Null Hypothesis ( H_0 ): ( \beta_i = 0 ), meaning the variable has no effect.

Alternative Hypothesis ( H_a ): ( \beta_i \neq 0 ), meaning the variable has a significant effect.

The p-value indicates the probability of observing the data if ( H_0 ) is true. A p-value < 0.05 typically leads to rejecting ( H_0 ), suggesting a significant relationship.

Confidence intervals (CIs) provide a range within which the true coefficient
likely lies, with a specified confidence level (e.g., 95%). If the CI excludes zero,
the coefficient is significant.
Using `statsmodels` to Perform Hypothesis Tests and Construct Confidence
Intervals
Let’s fit a simple linear regression model using medv as the dependent variable
and crim as the predictor:

import statsmodels.api as sm
import statsmodels.formula.api as smf

# Load the Boston Housing dataset


boston = sm.datasets.get_rdataset('Boston', 'MASS').data

# Fit the model


model = smf.ols('medv ~ crim', data=boston).fit()

# View the summary


print(model.summary())

In the output:

The p-value for crim (under P>|t|) tests ( H_0: \beta_{crim} = 0 ). If p < 0.05, crim significantly affects medv.

The 95% CI for crim (under [0.025 0.975]) shows the range of plausible values for ( \beta_{crim} ).

To extract the CI programmatically:

conf_int = model.conf_int(alpha=0.05)

print(conf_int)

Interpreting Results and Making Inferences

Hypothesis Testing: If the p-value for crim is 0.001, we reject ( H_0 ) and conclude that crime rate significantly impacts home values.

Confidence Intervals: If the CI for crim is [-0.5, -0.3], we are 95% confident that each unit increase in crime rate reduces home value by 0.3 to 0.5 units. Since the CI excludes zero, the effect is significant.

These tools help us determine which variables matter in our model.

Lecture 4.2: Model Comparison and Selection
Comparing Models Using Metrics

To choose the best regression model, we compare them using key metrics:

R-squared ( R^2 ): The proportion of variance in the dependent variable explained by the model. Higher values indicate better fit, but ( R^2 ) increases with more predictors, even if they’re irrelevant.

Adjusted R-squared: Adjusts ( R^2 ) for the number of predictors, penalizing unnecessary complexity. Use this for fair comparisons.

Akaike Information Criterion (AIC): Balances fit and complexity. Lower AIC suggests a better model.

Bayesian Information Criterion (BIC): Similar to AIC but penalizes complexity more heavily. Lower BIC is preferred.

Let’s compare two models:

1. Model 1: medv ~ crim

2. Model 2: medv ~ crim + rm

# Fit Model 1
model1 = smf.ols('medv ~ crim', data=boston).fit()

# Fit Model 2
model2 = smf.ols('medv ~ crim + rm', data=boston).fit()

# Compare metrics
print(f"Model 1 R-squared: {model1.rsquared:.3f}, Adjusted R-squared: {model1.rsquared_adj:.3f}")
print(f"Model 2 R-squared: {model2.rsquared:.3f}, Adjusted R-squared: {model2.rsquared_adj:.3f}")
print(f"Model 1 AIC: {model1.aic:.2f}, BIC: {model1.bic:.2f}")
print(f"Model 2 AIC: {model2.aic:.2f}, BIC: {model2.bic:.2f}")

Interpretation: If Model 2 has higher adjusted R-squared and lower AIC/BIC, it’s likely superior due to better fit and reasonable complexity.

Model Selection Techniques


Stepwise Regression

This method iteratively adds or removes variables based on criteria like p-values or AIC. While useful, it risks overfitting if not validated.
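statsmodels does not ship a built-in stepwise routine, but a minimal forward-selection sketch based on AIC might look like the following (the candidate column names are illustrative picks from the Boston dataset used above):

import statsmodels.formula.api as smf

def forward_select_aic(data, response, candidates):
    """Greedy forward selection: repeatedly add the predictor that most lowers AIC."""
    candidates = list(candidates)
    selected = []
    current_aic = smf.ols(f'{response} ~ 1', data=data).fit().aic  # intercept-only model
    improved = True
    while improved and candidates:
        improved = False
        scores = []
        for var in candidates:
            formula = f"{response} ~ {' + '.join(selected + [var])}"
            scores.append((smf.ols(formula, data=data).fit().aic, var))
        best_aic, best_var = min(scores)
        if best_aic < current_aic:
            current_aic, improved = best_aic, True
            selected.append(best_var)
            candidates.remove(best_var)
    return selected, current_aic

# Example (hypothetical candidate list):
# chosen, aic = forward_select_aic(boston, 'medv', ['crim', 'rm', 'lstat', 'chas'])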
Cross-Validation
Cross-validation evaluates a model’s performance on unseen data, ensuring it
generalizes well. Here’s an example using 5-fold cross-validation:
from sklearn.model_selection import cross_val_score

from sklearn.linear_model import LinearRegression


# Prepare data for Model 2
X = boston[['crim', 'rm']]
y = boston['medv']
# Perform 5-fold cross-validation
lr = LinearRegression()
mse = cross_val_score(lr, X, y, cv=5, scoring='neg_mean_squared_error')
print(f"Mean MSE: {-mse.mean():.2f}")

A lower mean squared error (MSE) indicates better predictive accuracy.

Avoiding Overfitting and Underfitting

Overfitting: The model is too complex, fitting noise instead of the true
pattern. It performs well on training data but poorly on new data.

Underfitting: The model is too simple, missing key patterns. It performs poorly on all data.

To balance these:

Use adjusted R-squared, AIC, or BIC to penalize unnecessary complexity.

Apply cross-validation to test generalization.

Ensure the number of predictors is appropriate for the sample size.

Summary
This module covered:

Lecture 4.1: Using hypothesis testing (p-values) and confidence intervals to assess variable significance in linear regression.

Lecture 4.2: Comparing models with metrics (R-squared, AIC, BIC) and
selecting the best one using techniques like stepwise regression and cross-
validation, while avoiding overfitting and underfitting.

With these skills, you can confidently analyze regression models using
statsmodels and make data-driven decisions!

Module 5: Time Series Analysis

Introduction
Time series analysis is essential for understanding and forecasting data
collected over time, such as stock prices, weather patterns, or sales figures.
This module introduces the core concepts of time series data and
demonstrates how to model and forecast it using statsmodels. We’ll use the
AirPassengers dataset, which records monthly airline passenger numbers
from 1949 to 1960, to illustrate key techniques.

Lecture 5.1: Introduction to Time Series
Analysis
Understanding Time Series Data and Its Characteristics
A time series is a sequence of data points recorded at regular time intervals.
Time series data often exhibits:

Trend: A long-term increase or decrease in the data.

Seasonality: Repeating patterns at fixed intervals (e.g., monthly or yearly).

Cyclicality: Fluctuations without a fixed period, often tied to economic cycles.

The AirPassengers dataset, for example, shows both an upward trend and
seasonal fluctuations.
Basic Concepts: Stationarity, Autocorrelation, and Partial Autocorrelation

Stationarity: A time series is stationary if its statistical properties (mean, variance) are constant over time. Many models, like ARIMA, assume stationarity. To test for stationarity, we use the Augmented Dickey-Fuller (ADF) test:

Null Hypothesis ( H_0 ): The series is non-stationary.

Alternative Hypothesis ( H_a ): The series is stationary.

If the p-value < 0.05, we reject ( H_0 ) and conclude the series is stationary.

Autocorrelation (ACF): Measures the correlation between a time series and its lagged values. It helps identify patterns and dependencies.

Partial Autocorrelation (PACF): Measures the correlation between a time series and its lagged values, controlling for shorter lags. It’s useful for determining the order of autoregressive terms in models.

Visualizing Time Series Data

Visualizations are crucial for understanding time series data:

Time Series Plot: Displays the data over time to reveal trends and
seasonality.
import statsmodels.api as sm

import matplotlib.pyplot as plt

import pandas as pd

# Load the AirPassengers dataset
air_passengers = sm.datasets.get_rdataset('AirPassengers').data

# Convert the fractional-year 'time' column (e.g. 1949.0833 for Feb 1949) to dates
air_passengers['date'] = pd.to_datetime(
    air_passengers['time'].apply(lambda x: f"{int(x)}-{int(round((x % 1) * 12)) + 1:02d}-01"))
air_passengers.set_index('date', inplace=True)

# Plot the time series


plt.plot(air_passengers['value'])
plt.title('AirPassengers Time Series')
plt.xlabel('Year')
plt.ylabel('Passengers')
plt.show()

ACF Plot: Shows autocorrelation at different lags.

from statsmodels.graphics.tsaplots import plot_acf


plot_acf(air_passengers['value'], lags=40)
plt.title('Autocorrelation Function (ACF)')
plt.show()

PACF Plot: Shows partial autocorrelation at different lags.

from statsmodels.graphics.tsaplots import plot_pacf

plot_pacf(air_passengers['value'], lags=40)
plt.title('Partial Autocorrelation Function (PACF)')
plt.show()

These plots help identify the appropriate model and its parameters.
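The trend and seasonality noted earlier can also be made explicit with a seasonal decomposition. A brief sketch, assuming a multiplicative structure for the monthly series loaded above:

from statsmodels.tsa.seasonal import seasonal_decompose

# Split the series into trend, seasonal, and residual components (12-month period)
decomposition = seasonal_decompose(air_passengers['value'], model='multiplicative', period=12)
decomposition.plot()
plt.show()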

Lecture 5.2: Time Series Models in Statsmodels
Using ARIMA and SARIMAX Models for Time Series Forecasting

ARIMA (AutoRegressive Integrated Moving Average): Suitable for non-
seasonal time series. It has three components:

AR (p): Autoregressive terms (lags of the series).

I (d): Differencing to achieve stationarity.

MA (q): Moving average terms (lags of the forecast errors).

SARIMAX (Seasonal ARIMA with eXogenous variables): Extends ARIMA to handle seasonal data and external variables.

Model Specification, Estimation, and Diagnostics


Step 1: Check for Stationarity

Use the ADF test to check if the series is stationary. If not, apply differencing.

from statsmodels.tsa.stattools import adfuller

# ADF test

result = adfuller(air_passengers['value'])

print(f'ADF Statistic: {result[0]}')

print(f'p-value: {result[1]}')

# If p > 0.05, difference the series (the first differenced value is NaN)
air_passengers['diff_value'] = air_passengers['value'].diff()

Step 2: Identify Model Orders Using ACF and PACF

AR(p): PACF cuts off after lag p.

MA(q): ACF cuts off after lag q.

For seasonal data, look for patterns at seasonal lags (e.g., every 12
months).

Step 3: Fit the Model

For ARIMA, specify the orders (p, d, q). For SARIMAX, include seasonal orders
(P, D, Q, s).

from statsmodels.tsa.arima.model import ARIMA

# Example: ARIMA(1,1,1)

arima_model = ARIMA(air_passengers['value'], order=(1,1,1)).fit()

print(arima_model.summary())

For seasonal data like AirPassengers, use SARIMAX with seasonal parameters:

from statsmodels.tsa.statespace.sarimax import SARIMAX

# Example: SARIMAX(1,1,1)(1,1,1,12)
sarimax_model = SARIMAX(air_passengers['value'], order=(1,1,1),
                        seasonal_order=(1,1,1,12)).fit()

print(sarimax_model.summary())

Step 4: Model Diagnostics


Check if residuals resemble white noise (no autocorrelation):

from statsmodels.graphics.tsaplots import plot_acf

# Residuals plot
residuals = sarimax_model.resid

plot_acf(residuals, lags=40)
plt.title('ACF of Residuals')
plt.show()

If the ACF plot shows no significant autocorrelation, the model is adequate.


Evaluating Model Performance Using Metrics
Common metrics for forecasting accuracy include:

Mean Absolute Error (MAE): Average absolute difference between
forecasts and actual values.

Mean Squared Error (MSE): Average squared difference, penalizing larger errors more.
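For reference, with ( y_i ) the actual values, ( \hat{y}_i ) the forecasts, and ( n ) the number of test observations, these are defined as ( \text{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i| ) and ( \text{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 ).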

To evaluate, split the data into training and test sets:

# Split data (e.g., last 12 months as test)


train = air_passengers['value'][:-12]
test = air_passengers['value'][-12:]

# Fit model on training data


model = SARIMAX(train, order=(1,1,1), seasonal_order=(1,1,1,12)).fit()

# Forecast
forecast = model.forecast(steps=12)

# Calculate MAE and MSE


from sklearn.metrics import mean_absolute_error, mean_squared_error

mae = mean_absolute_error(test, forecast)


mse = mean_squared_error(test, forecast)

print(f'MAE: {mae:.2f}, MSE: {mse:.2f}')

Lower MAE and MSE indicate better forecasting performance.

Summary
This module covered:

Lecture 5.1: The fundamentals of time series data, including stationarity, autocorrelation, and visualization techniques (time series plots, ACF, PACF).

Lecture 5.2: How to specify, estimate, and diagnose ARIMA and SARIMAX
models in statsmodels, and evaluate their performance using MAE and
MSE.

With these skills, you can analyze and forecast time series data effectively
using statsmodels!

Module 6: Advanced Topics and Case Studies

Introduction
This module dives into advanced linear regression techniques and practical
applications of statsmodels. In Lecture 6.1, we cover methods to handle
complex data scenarios like unequal variances, correlated errors, and outliers.
In Lecture 6.2, we explore case studies from finance, economics, and social
sciences to illustrate how statsmodels solves real-world problems. The Boston
Housing dataset is used for the advanced techniques, while diverse datasets
highlight the case studies.

Lecture 6.1: Advanced Linear Regression Techniques
This lecture introduces advanced methods to enhance linear regression,
addressing challenges like heteroscedasticity, correlated errors, and outliers.
We’ll use statsmodels for implementation.
Weighted Least Squares and Generalized Least Squares
Weighted Least Squares (WLS):
WLS adjusts for heteroscedasticity—when observation variances are unequal.
It assigns weights to observations, giving more influence to those with smaller
variances.

Purpose: Corrects for non-constant residual variance.

How it works: Minimizes the weighted sum of squared residuals, with weights typically set as the inverse of variance.

Example: Using the Boston Housing dataset, suppose variance increases with
crime rate (crim). We weight observations inversely to crim.

import statsmodels.api as sm
import statsmodels.formula.api as smf

# Load Boston Housing dataset

boston = sm.datasets.get_rdataset('Boston', 'MASS').data

# Define weights (inverse of crim)


weights = 1 / boston['crim']

# Fit WLS model


wls_model = smf.wls('medv ~ crim + rm', data=boston, weights=weights).fit()

print(wls_model.summary())

Output: Coefficients reflect adjusted influence, improving reliability under heteroscedasticity.

Generalized Least Squares (GLS):
GLS extends WLS by also accounting for correlations between observations, common in time series or spatial data.

Purpose: Handles both heteroscedasticity and correlated errors.

How it works: Incorporates a covariance matrix to model error structure.

Example: For the Boston dataset, assume correlated errors among nearby
towns (simplified here with an identity covariance matrix).

# Add constant and define predictors


X = sm.add_constant(boston[['crim', 'rm']])

y = boston['medv']

# Define covariance matrix (identity for simplicity)


import numpy as np

cov_matrix = np.eye(len(boston))

# Fit GLS model


gls_model = sm.GLS(y, X, sigma=cov_matrix).fit()

print(gls_model.summary())

Note: Real applications require estimating the covariance matrix based on
data structure (e.g., autocorrelation).
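For the autocorrelated-error case specifically, one practical route is statsmodels' GLSAR class, which estimates an AR(p) error structure iteratively instead of requiring the covariance matrix up front. A sketch reusing X and y from above (treating the row order as meaningful is an assumption made purely for illustration):

# Feasible GLS with AR(1) errors, estimated iteratively
glsar_model = sm.GLSAR(y, X, rho=1).iterative_fit(maxiter=5)
print(glsar_model.summary())
print(glsar_model.model.rho)  # estimated autocorrelation of the errors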

Robust Regression and Outlier Detection


Robust Regression:
Robust regression reduces the impact of outliers, making it ideal when data
contains extreme values that could skew results.

Purpose: Provides stable estimates despite outliers.

How it works: Uses robust estimators (e.g., Huber’s T) to downweight outliers.

Example: Apply robust regression to the Boston dataset.

from statsmodels.robust.robust_linear_model import RLM

# Fit robust model with Huber’s T

X = sm.add_constant(boston[['crim', 'rm']])

y = boston['medv']

robust_model = RLM(y, X, M=sm.robust.norms.HuberT()).fit()

print(robust_model.summary())

Output: Coefficients are less sensitive to outliers, offering a robust alternative to OLS.

Outlier Detection:
Outlier detection identifies anomalous points that may distort models.
Techniques include residual analysis and influence measures like Cook’s
distance.
Example: Detect outliers in the Boston dataset using Cook’s distance.

# Fit OLS model


ols_model = sm.OLS(y, X).fit()

# Calculate Cook’s distance

influence = ols_model.get_influence()
cooks_d = influence.cooks_distance[0]

# Identify outliers (threshold: 4/n)


n = len(boston)
outliers = np.where(cooks_d > 4 / n)[0]

print(f'Potential outliers at indices: {outliers}')

Next Steps: Investigate outliers; remove if erroneous or use robust regression.

Lecture 6.2: Case Studies and Applications
This lecture showcases real-world applications of statsmodels across finance,
economics, and social sciences, with examples, best practices, and pitfalls.
Real-World Examples and Case Studies Using `statsmodels`
Case Study 1: Finance – Stock Price Prediction
Goal: Predict stock prices using historical data and external factors.

Dataset: Hypothetical stock data (e.g., replace with Yahoo Finance data).

Model: Time series regression or ARIMA.

Example:

import statsmodels.tsa.api as tsa

# Assume 'stock_data' has 'price' column


# Fit ARIMA(1,1,1) model
arima_model = tsa.ARIMA(stock_data['price'], order=(1,1,1)).fit()

print(arima_model.summary())

Insight: Captures trends and autocorrelation in stock prices.

Case Study 2: Economics – GDP Growth Analysis


Goal: Assess how interest rates and inflation affect GDP growth.

Dataset: Macroeconomic data (e.g., from public sources).

Model: Multiple linear regression.

Example:

# Assume 'econ_data' has 'gdp_growth', 'interest_rate', 'inflation'
econ_model = smf.ols('gdp_growth ~ interest_rate + inflation', data=econ_data).fit()

print(econ_model.summary())

Insight: Quantifies economic relationships, assuming linearity.

Case Study 3: Social Sciences – Survey Data Analysis


Goal: Explore the link between education and income.

Dataset: Survey data with categorical ‘education’ levels.

Model: Regression with dummy variables.

Example:

# Assume 'survey_data' has 'income' and 'education'


survey_model = smf.ols('income ~ C(education)', data=survey_data).fit()

print(survey_model.summary())

Insight: Shows income differences across education levels.

Applying `statsmodels` to Various Domains

Finance: Time series models (ARIMA, GARCH) for stock or volatility analysis.

Economics: Regression for policy impact or macroeconomic studies.

Social Sciences: Models with categorical variables for survey or behavioral data.

Best Practices and Common Pitfalls


Best Practices

Data Preprocessing: Clean data, handle missing values, and encode categoricals.

Model Selection: Validate assumptions (e.g., normality, homoscedasticity).

Interpretation: Contextualize results within the domain.

Common Pitfalls

Ignoring Assumptions: Leads to biased estimates (e.g., heteroscedasticity in OLS).

Overfitting: Too many predictors without validation.

Misinterpretation: Confusing correlation with causation or misreading coefficients.

Summary
Lecture 6.1: Covered WLS, GLS, robust regression, and outlier detection
using statsmodels, with examples from the Boston Housing dataset.

Lecture 6.2: Presented case studies in finance, economics, and social sciences, demonstrating statsmodels applications, best practices, and pitfalls.

This module equips you with advanced tools and practical knowledge to apply
statsmodels effectively in diverse, real-world scenarios.

Module 7: Putting it All Together


This final module integrates the concepts and techniques you’ve learned into a
cohesive framework. Lecture 7.1 focuses on developing a guided project using
statsmodels, while Lecture 7.2 covers presenting your work, reflecting on key
takeaways, and exploring resources for further growth.

Lecture 7.1: Project Development and Implementation
In this lecture, you’ll develop a guided project using statsmodels, applying the
statistical modeling techniques from the course to a real-world problem. We’ll
also cover best practices for organizing and documenting your project.
Guided Project Development Using statsmodels

The guided project is your chance to apply what you’ve learned to a dataset
and problem of your choosing. Using statsmodels, you’ll perform statistical
analysis and modeling, following these steps:

1. Choose a Dataset
Select a dataset that interests you—either from previous modules (e.g.,
Boston Housing) or a public source like Kaggle. Ensure it aligns with the
problem you want to solve.

2. Define the Problem


State a clear research question or objective, such as predicting an outcome
(e.g., home prices) or understanding relationships between variables.

3. Explore the Data


Analyze the dataset using techniques like summary statistics, visualizations
(e.g., histograms, scatter plots), and checks for missing values or outliers.

4. Build and Evaluate Models


Use statsmodels to apply appropriate models:

Linear regression for continuous outcomes.

ARIMA or SARIMAX for time series data.

Advanced methods (e.g., robust regression) for complex scenarios.


Evaluate your model with metrics like R-squared, AIC, or mean squared
error (MSE).

5. Interpret the Results


Draw conclusions using statistical inference—interpret coefficients, p-
values, and confidence intervals, and validate model assumptions with
diagnostics (e.g., residual plots).

6. Document the Project


Create clear documentation, including code comments and a report with an
introduction, methodology, results, and discussion.

Example: Imagine using the Boston Housing dataset to predict median home
values. You’d explore variables like crime rate and room count, build a linear
regression model, evaluate its fit, and interpret how each factor affects prices.
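A compressed sketch of that workflow, assuming the Boston dataset and the variable names used earlier in the course:

import statsmodels.api as sm
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt

# Load and explore
boston = sm.datasets.get_rdataset('Boston', 'MASS').data
print(boston[['medv', 'crim', 'rm']].describe())

# Fit and evaluate a linear model
model = smf.ols('medv ~ crim + rm', data=boston).fit()
print(model.summary())  # coefficients, p-values, confidence intervals, R-squared, AIC

# Basic diagnostic: residuals vs. fitted values
plt.scatter(model.fittedvalues, model.resid)
plt.axhline(0, color='red', linestyle='--')
plt.xlabel('Fitted values')
plt.ylabel('Residuals')
plt.show()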
Applying Concepts and Techniques Learned Throughout the Course

This project draws on the entire course:

Data Exploration: Techniques from Module 2 for cleaning and visualizing
data.

Modeling: Linear regression (Module 3), inference (Module 4), and time
series (Module 5).

Advanced Methods: Tools from Module 6 for handling special cases.

You’ll synthesize these skills to address your chosen problem effectively.


Best Practices for Project Organization and Documentation
To ensure your project is professional and reproducible:

Version Control: Use Git to track changes.

File Structure: Organize files into folders (e.g., data/, scripts/, docs/).

Reproducible Code: Use relative paths, list dependencies (e.g., in requirements.txt), and add comments.

Documentation: Write a report covering your problem, methods, findings, and insights, formatted in Markdown or similar.

Lecture 7.2: Final Project Presentations and Course Wrap-up
This lecture prepares you to present your project, receive feedback, review the
course’s key lessons, and plan your next steps with additional resources.
Presenting Final Projects and Receiving Feedback
Your presentation is an opportunity to showcase your work. Structure it as
follows:

Introduction: Explain the problem and its relevance.

Methodology: Describe your approach and model choices.

Results: Share findings with visuals (e.g., plots) and stats (e.g., model
summaries).

Discussion: Highlight implications and limitations.

Receiving Feedback:

Listen carefully and ask clarifying questions.

Stay open to suggestions, even if critical.

Use feedback to refine your project.

Reviewing Key Concepts and Takeaways from the Course


Here’s a recap of the core ideas you’ve mastered:

Data Prep & Exploration: Cleaning and analyzing data sets the stage for
modeling.

Linear Regression: A foundational tool for prediction and inference.

Statistical Inference: Hypothesis testing and confidence intervals provide rigor.

Time Series: ARIMA models enable forecasting.

Advanced Techniques: Methods like robust regression tackle complex data.

These skills equip you to solve real-world problems with statistical rigor.
Resources for Further Learning and Professional Development
Continue your growth with these resources:

Books:

Applied Linear Regression by Sanford Weisberg

Time Series Analysis and Its Applications by Shumway and Stoffer

Online Courses:

Coursera’s “Data Science Specialization”

edX’s “Data Science MicroMasters”

Communities:

Stack Overflow for coding help

Reddit’s r/stats for discussions

Kaggle for datasets and collaboration

Conclusion
Module 7 ties together your learning journey. In Lecture 7.1, you’ll create a
project that demonstrates your skills with statsmodels, organized and
documented to a high standard. In Lecture 7.2, you’ll present your work, reflect on the course, and gain resources to keep advancing. Congratulations on
reaching this point—you’re ready to apply statistical modeling to new
challenges!

