Geographically Weighted Regression Workshop

Geographically Weighted Regression (GWR) is a statistical technique that allows for spatial variations in relationships. This workshop presentation introduces GWR, covering its motivation, basic concepts, and implementation. Key points include: examining OLS residuals to identify nonstationarity; estimating separate regression coefficients for each observation based on spatial weights; advantages like improved model fit and reduced spatial autocorrelation; and guidance on running GWR models and assessing results.

Geographically Weighted Regression

CSDE Statistics Workshop
Christopher S. Fowler, PhD
February 1st, 2011

Significant portions of this workshop were culled from presentations prepared by Fotheringham, Charlton and Brunsdon and presented at the 2010 Advanced Workshop on Spatial Analysis at the University of California, Santa Barbara.

University of Washington

Center for Studies in Demography and Ecology

Outline for the Session

The motivation for GWR

Examples from YOUR discipline

Mapping OLS Residuals

A good baseline for why we need GWR

GWR
Definitions, basic concepts

Running GWR
A straightforward implementation in ArcGIS

GWR and some extensions

Basics of OLS

$y = \alpha + \beta X + \varepsilon$
Assumes a stationary process
Same stimulus provokes the same response anywhere in the study area

Why might relationships vary spatially?

Sampling variation
Relationships intrinsically different across space (attitudes, preferences, contextual effects)
Model misspecification

Applications: Ecology

GWR works on trees

Could have been differentiated: the sampling pattern creates predictable and changing levels of interaction among observations

Applications: Public Health

Relationships vary systematically

The relationships between mortality and occupational segregation, and between mortality and unemployment, vary across Tokyo

Applications: Sociology/Public Policy

Missing variables (and they may very well be unknowable)

The link between multifamily housing and residential burglaries varies widely even when controlling for numerous socioeconomic and neighborhood factors

Back up: How do we know if we have nonstationarity in our model?

Map residuals and test them for spatial autocorrelation. If our model errs systematically with a spatial pattern, then we may be on to something.

Poverty in the Southern U.S.

Our example Model

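$\text{Poverty} = \beta_0 + \beta_1\,\text{FemaleHeadedHousehold} + \beta_2\,\text{Unemployed} + \beta_3\,\text{Black} + \beta_4\,\text{65andOlder} + \beta_5\,\text{Metro} + \beta_6\,\text{AtLeastHighSchoolEducation} + \varepsilon$
Based on the work of Paul Voss and Katherine Curtis
These are all understood to be good predictors of poverty
What kinds of spatial structures influence this data set?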

Lab Part 1

Run our OLS model in ArcGIS

Examine model output
Map residuals

Calculate Moran's I and Local Moran's I
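The lab runs these steps in ArcGIS. For readers working outside ArcGIS, the same two checks (fit OLS, then test the residuals for spatial autocorrelation) can be sketched with NumPy. Everything in this sketch is an assumption made for illustration: the synthetic 10 x 10 grid, the row-standardized rook-contiguity weights, and the function names `ols_residuals`, `morans_i`, and `rook_weights`. It is not the workshop's poverty data or ArcGIS's implementation, and it computes only the global Moran's I statistic, without a significance test.

```python
import numpy as np

def ols_residuals(X, y):
    """Fit y = X b by least squares; return (coefficients, residuals)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def morans_i(values, W):
    """Global Moran's I of `values` under the spatial weights matrix W."""
    z = values - values.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

def rook_weights(nrows, ncols):
    """Row-standardized rook-contiguity weights for a regular grid (illustrative)."""
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrows and 0 <= cc < ncols:
                    W[i, rr * ncols + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

# Synthetic data: y depends on x and on an unmodeled west-to-east trend,
# so the aspatial model errs systematically with a spatial pattern.
nrows, ncols = 10, 10
rng = np.random.default_rng(0)
col = np.tile(np.arange(ncols), nrows).astype(float)  # column index of each cell
x = rng.normal(size=nrows * ncols)
y = 1.0 + 0.5 * x + 0.3 * col + rng.normal(scale=0.1, size=x.size)

X = np.column_stack([np.ones_like(x), x])  # intercept plus one predictor
beta, resid = ols_residuals(X, y)
W = rook_weights(nrows, ncols)
moran_resid = morans_i(resid, W)
print("OLS coefficients:", beta)
print("Moran's I of residuals:", moran_resid)
```

A clearly positive Moran's I on the residuals is the signal the slides describe: the model's errors cluster in space, which motivates the options discussed next.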

Our best aspatial model

So what now?
Add more missing variables and try again
Repeat the steps from the lab

Accept that there is something about certain places that makes them different (spatial heterogeneity)
Try GWR

Test variables meant to explore interactions taking place at short distances (spatial dependence)
Try Spatial Regression (Likely a spatial lag model)

Assume that the correlation is a nuisance and control for it in the error term
Try Spatial Regression (Likely a spatial error model)

Outline for Part II

What is GWR?
Weighting in GWR

Geographically Weighted Regression

Local statistical technique to analyze spatial variations in relationships
We are not content with global averages of spatial data (climate for example)
Why should we be satisfied with global averages in a statistical analysis?

Put another way... Simpson's Paradox

If we think of these points as our data, grouped into colors by region, we can see that the global and local models differ significantly

Source: Rücker and Schumacher, BMC Medical Research Methodology 2008, 8:34, doi:10.1186/1471-2288-8-34

Basic definitions
Spatial nonstationarity exists when the same stimulus provokes a different response in different parts of the study region
Global models are statements about processes that are assumed to be stationary and, as such, are location independent
Local models are spatial disaggregations of global models, the results of which are location specific
Spatial heterogeneity refers to spatial patterns resulting from broad similarities usually over time
Spatial dependence refers to spatial patterns that result from interactions among observations

Spatial Heterogeneity and Spatial Dependence

GWR and Spatial Processes

GWR is excellent at picking up broad scale regional differences

spatial heterogeneity

Not as effective at dealing with small scale interaction processes

Too much bias in each local model
That doesn't mean it won't try (and give you misleading results)

GWR in a nutshell
Global model

$y = \alpha + \beta X + \varepsilon$

becomes

$y_i = \alpha_i + \beta_i X + \varepsilon_i$
Where the subscript $i$ indicates that there is a set of coefficients estimated for every observation in our data set

The Key Difference

We estimate a set of regression coefficients for each observation

To do so we weight near observations more heavily than more distant ones. We may also estimate coefficients based on some local subset of observations
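One common way to write "a set of coefficients for each observation" is as a weighted least squares fit repeated at every location $i$, with $\hat\beta_i = (X^\top W_i X)^{-1} X^\top W_i y$, where $W_i$ is a diagonal matrix of weights that decay with distance from $i$. The sketch below illustrates that idea under stated assumptions: a Gaussian distance-decay weight, synthetic points on a line whose true slope drifts from 0 to 2, and hypothetical function names `gwr_local_fit` and `gwr_fit`. It is not the ArcGIS implementation.

```python
import numpy as np

def gwr_local_fit(coords, X, y, i, bandwidth):
    """Weighted least squares centered on observation i (Gaussian weights)."""
    d = np.linalg.norm(coords - coords[i], axis=1)   # distance from i to every j
    w = np.exp(-0.5 * (d / bandwidth) ** 2)          # near observations weigh more
    XtW = X.T * w                                    # same as X.T @ diag(w)
    return np.linalg.solve(XtW @ X, XtW @ y)

def gwr_fit(coords, X, y, bandwidth):
    """One coefficient vector per observation, stacked into an (n, k) array."""
    return np.array([gwr_local_fit(coords, X, y, i, bandwidth)
                     for i in range(len(y))])

# Synthetic data: the slope on x rises from 0 in the west to 2 in the east.
n = 200
rng = np.random.default_rng(1)
u = np.linspace(0.0, 1.0, n)
coords = np.column_stack([u, np.zeros(n)])
x = rng.normal(size=n)
y = 1.0 + (2.0 * u) * x + rng.normal(scale=0.1, size=n)
X = np.column_stack([np.ones(n), x])

local_betas = gwr_fit(coords, X, y, bandwidth=0.1)
global_beta = np.linalg.lstsq(X, y, rcond=None)[0]
print("global slope:", global_beta[1])
print("local slope, west end:", local_betas[0, 1])
print("local slope, east end:", local_betas[-1, 1])
```

The global fit reports a single middling slope, while the local fits recover the west-to-east drift. That contrast is the key difference the slide describes.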

Some advantages of GWR

Excellent tool for testing model specification

Where does model fit look good, where are you missing something?

Residuals generally lower and not spatially autocorrelated

Real values for $\beta$

.9 .8 .7 .6 .5
.8 .7 .6 .5 .4
.8 .6 .5 .4 .3
.7 .5 .4 .3 .2
.5 .4 .4 .2 .1

Estimated values of $\beta$ in global model

.5 .5 .5 .5 .5

Residuals from global model

+ + + + 0 + + + 0 + + 0 + 0 0 -

Reasons to use GWR

Identify model misspecification
Identify nonstationarity in relationships

Improved model fit (R², AIC, etc.)
Reduced spatial autocorrelation
Represent context

Address spatial heterogeneity when precise variables may not exist

You've convinced me, what next?

Run your aspatial model (as we did in 1st lab)

We will want the results and diagnostics to compare with what comes next.

Decide how you are going to weight your nearby locations

Fixed bandwidth
Variable bandwidth
User-defined bandwidth

It all comes down to how you weight the observations

We can use a fixed bandwidth h

$w_{ij} = \exp\left[-\tfrac{1}{2}\left(d_{ij}/h\right)^2\right]$

Number of observations will vary, but area they represent will remain constant

Weighting option 2

Or we can employ an adaptive bandwidth

$w_{ij} = \left[1 - \left(d_{ij}^2 / h^2\right)\right]^2$ if $j$ is one of $i$'s $N$ nearest neighbors, otherwise $w_{ij} = 0$

Number of observations will remain fixed, but area will not be the same
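The two weighting options can be compared side by side in a few lines. This is a minimal sketch: the function names `gaussian_fixed` and `bisquare_adaptive` are hypothetical, and setting the adaptive bandwidth $h$ to the distance of the $N$-th nearest neighbor is an assumption (software packages differ on this detail, and it gives the $N$-th neighbor itself a weight of exactly zero).

```python
import numpy as np

def gaussian_fixed(d, h):
    """Fixed-bandwidth Gaussian weights: w = exp(-0.5 * (d / h)^2)."""
    return np.exp(-0.5 * (d / h) ** 2)

def bisquare_adaptive(d, n_neighbors):
    """Adaptive bisquare weights: w = (1 - d^2 / h^2)^2 inside h, 0 outside.

    Assumption: h is the distance to the n_neighbors-th nearest point in d.
    """
    h = np.sort(d)[n_neighbors - 1]
    w = (1.0 - (d / h) ** 2) ** 2
    w[d > h] = 0.0   # points farther away than h do not enter the local model
    return w

# Distances from one regression point to six observations (made-up values).
d = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 10.0])
w_fixed = gaussian_fixed(d, h=2.0)
w_adaptive = bisquare_adaptive(d, n_neighbors=4)
print("fixed Gaussian:   ", np.round(w_fixed, 3))
print("adaptive bisquare:", np.round(w_adaptive, 3))
```

The Gaussian weights never reach zero, so every observation contributes a little. The bisquare weights cut off sharply, so only the nearest neighbors contribute, whatever area they happen to cover.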

Kernels and Weights

Bandwidth specifies shape of weights curve
Kernel type tells us whether we will define our bandwidth based on distance (fixed) or number of neighbors (adaptive)

So how do we know what bandwidth to use?

Judging the appropriate bandwidth

A tradeoff between:
Bias: we include observations that are not part of the same spatial group
Variance: we don't have enough points in our model to say anything with conviction

[Figure: AIC (vertical axis) plotted against bandwidth (horizontal axis); labels on the curve: Variance, Optimum, Bias]

AICc or CV measure model fit. Optimize fit to obtain the best bandwidth.
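The optimization step can be illustrated with a leave-one-out cross-validation (CV) score. For each candidate bandwidth, every observation is predicted from a local model that excludes it, and the bandwidth with the smallest sum of squared prediction errors wins. The sketch below assumes a fixed Gaussian kernel, a simple grid of candidate bandwidths, synthetic data, and hypothetical function names `cv_score` and `select_bandwidth`. Real software typically uses a more careful search, and may use AICc instead of CV.

```python
import numpy as np

def cv_score(coords, X, y, h):
    """Leave-one-out CV: sum of squared errors predicting each y[i] without i."""
    sse = 0.0
    for i in range(len(y)):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / h) ** 2)
        w[i] = 0.0                                   # leave observation i out
        XtW = X.T * w
        beta_i = np.linalg.solve(XtW @ X, XtW @ y)
        sse += (y[i] - X[i] @ beta_i) ** 2
    return sse

def select_bandwidth(coords, X, y, candidates):
    """Return the candidate bandwidth with the lowest CV score, plus all scores."""
    scores = [cv_score(coords, X, y, h) for h in candidates]
    return candidates[int(np.argmin(scores))], scores

# Synthetic data with a slope that drifts from 0 to 2 across the study area.
n = 150
rng = np.random.default_rng(2)
u = np.linspace(0.0, 1.0, n)
coords = np.column_stack([u, np.zeros(n)])
x = rng.normal(size=n)
y = 1.0 + (2.0 * u) * x + rng.normal(scale=0.3, size=n)
X = np.column_stack([np.ones(n), x])

candidates = [0.02, 0.05, 0.1, 0.2, 0.5, 5.0]   # 5.0 is close to a global model
best_h, scores = select_bandwidth(coords, X, y, candidates)
for h, s in zip(candidates, scores):
    print(f"h = {h:<5} CV = {s:.2f}")
print("selected bandwidth:", best_h)
```

Very small bandwidths score poorly because each local model rests on too few points (variance). Very large ones score poorly because they average over places with different relationships (bias).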

To sum

Weighting assumptions are very important to outcomes in GWR
Fixed distance kernel is more appropriate when the distribution of your observations is relatively stable across space (e.g. size, number of neighbors)
Adaptive kernel is appropriate when distribution varies across space (e.g. events are clustered or polygons are heterogeneous)
Once a kernel type is selected, optimization takes some of the guesswork out of it, but robustness checks are still needed

Residuals from the OLS model from last lesson

Looks reasonably good

Moran's I is still 0.22 and highly significant

Lab
Run GWR model
Check residuals
Check variation in coefficients

Further topics/issues in GWR

Where to go for next steps
General troubleshooting
Significance testing
Outlier problems
Poisson and Logistic model implementations
Mixed form models

Other software implementations of GWR

GWR 3.x (4.0 should be out soon)
R (spgwr package)
Stata
Matlab
Perhaps others I haven't heard of

General Troubleshooting

Regional dummies: BAD

Eliminate them from the model: we are trying to show regional variation, not control for it

Binary and low probability count variables

Use caution: lack of variation may cause the model to crash or have trouble finding a workable bandwidth

Significance Testing

How do I know if the variation I see in my coefficients is meaningful?
Could do t-test, but you will run into problems with multiple (1,387) tests
Results in lots of false positives
Standard correction (Bonferroni) will make any significance finding nearly impossible

Best Method: Monte Carlo simulation

Randomly reassign all observation values (dependent and independent variables travel together) to different observation locations
Each county's data gets assigned randomly to a different county

Re-run GWR and record coefficients
Repeat lots of times (at least 100)
Define a distribution for coefficient values and compare your coefficients to this distribution
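The permutation procedure above can be sketched as follows. Each simulation shuffles whole rows (dependent and independent variables together) across locations, refits the local models, and records how much one coefficient varies across space. Several choices here are assumptions made for illustration: the summary statistic (the standard deviation of the local coefficients), the fixed Gaussian kernel, the synthetic data, and the names `gaussian_weights`, `gwr_coefs`, and `monte_carlo_p`.

```python
import numpy as np

def gaussian_weights(coords, h):
    """n x n matrix of Gaussian weights between all pairs of locations."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    return np.exp(-0.5 * (d / h) ** 2)

def gwr_coefs(Wmat, X, y):
    """Local weighted least squares coefficients, one row per location."""
    betas = np.empty((len(y), X.shape[1]))
    for i, w in enumerate(Wmat):
        XtW = X.T * w
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

def monte_carlo_p(Wmat, X, y, k, n_sims=100, seed=0):
    """Permutation p-value for the spatial variability of coefficient k."""
    rng = np.random.default_rng(seed)
    observed = gwr_coefs(Wmat, X, y)[:, k].std()
    sims = np.empty(n_sims)
    for s in range(n_sims):
        perm = rng.permutation(len(y))       # rows of X and y travel together
        sims[s] = gwr_coefs(Wmat, X[perm], y[perm])[:, k].std()
    p = (1 + np.sum(sims >= observed)) / (n_sims + 1)
    return observed, p

# Synthetic data with a genuinely nonstationary slope (0 in the west, 2 in the east).
n = 80
rng = np.random.default_rng(3)
u = np.linspace(0.0, 1.0, n)
coords = np.column_stack([u, np.zeros(n)])
x = rng.normal(size=n)
y = 1.0 + (2.0 * u) * x + rng.normal(scale=0.3, size=n)
X = np.column_stack([np.ones(n), x])

Wmat = gaussian_weights(coords, h=0.15)
obs_sd, p_value = monte_carlo_p(Wmat, X, y, k=1)
print("observed SD of local slopes:", obs_sd)
print("permutation p-value:", p_value)
```

A small p-value means the observed spread of local coefficients is larger than almost anything produced when the data are scattered at random, which is evidence of nonstationarity for that variable.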

Other method: Fotheringham Significance Test

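$\alpha_{\text{Fotheringham}} = \dfrac{\xi}{1 + p_e - \dfrac{p_e}{np}}$

$\xi$ is the desired overall significance level
$p_e$ is the effective number of parameters
$p$ is the number of parameters
$n$ is the number of observations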

Fotheringham Significance Test

$\alpha_{\text{Fotheringham}} = \dfrac{\xi}{1 + p_e - \dfrac{p_e}{np}} = \dfrac{0.05}{1 + 37.97 - \dfrac{37.97}{1387 \times 8}} = 0.001283$

In Excel we can find the significant t-statistic using: TINV(.001283,1379)
In R we use: qt(1-(.001283/2),1379)
Either way we get a value of ~3.23
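The same arithmetic can be checked with the Python standard library alone. This is a sketch under stated assumptions. The function names are hypothetical, and because the standard library has no Student's t quantile function, the critical value is approximated from the normal quantile with a first-order Cornish-Fisher correction. With 1,379 degrees of freedom that approximation should land very close to the exact value from Excel or R, but it is an approximation rather than the same calculation.

```python
from statistics import NormalDist

def fotheringham_alpha(xi, p_e, n, p):
    """Adjusted per-test significance level: xi / (1 + p_e - p_e / (n * p))."""
    return xi / (1.0 + p_e - p_e / (n * p))

def approx_t_critical(alpha, df):
    """Approximate two-tailed t critical value.

    Assumption: normal quantile z plus the first Cornish-Fisher term
    (z^3 + z) / (4 * df), which is adequate only for large df.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z + (z ** 3 + z) / (4.0 * df)

# Values from the worked example: xi = .05, p_e = 37.97, n = 1387, p = 8.
alpha = fotheringham_alpha(0.05, 37.97, 1387, 8)
t_crit = approx_t_critical(alpha, 1387 - 8)
print("adjusted alpha:", round(alpha, 6))
print("approximate critical t:", t_crit)
```

Local t-values larger in absolute value than this threshold would be treated as significant under the adjusted level.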

Results: Significant Nonstationarity for Percent Hispanic

Outlier problems

Outliers cause problems for everybody, but their impact is greater for local regressions, particularly when bandwidth keeps number of observations low.
In standard OLS:
Run model and identify observations with high or low residuals (~ +/- 4)
Weight these observations less than 1
Re-run until none of the observations have extreme residuals
Now do your GWR with weights assigned
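The reweighting loop described above might look like the following in code. The slides do not say how much to reduce an outlier's weight or how residuals are scaled, so two details here are assumptions: each flagged observation has its weight halved on every pass, and residuals are standardized by the weighted residual standard error. The function name `downweight_outliers` and the synthetic data are also hypothetical.

```python
import numpy as np

def downweight_outliers(X, y, threshold=4.0, shrink=0.5, max_iter=50):
    """Iteratively shrink the weights of observations with extreme residuals."""
    w = np.ones(len(y))
    dof = len(y) - X.shape[1]
    for _ in range(max_iter):
        XtW = X.T * w
        beta = np.linalg.solve(XtW @ X, XtW @ y)   # weighted least squares fit
        resid = np.sqrt(w) * (y - X @ beta)        # weighted residuals
        std_resid = resid / np.sqrt(resid @ resid / dof)
        extreme = np.abs(std_resid) > threshold
        if not extreme.any():
            break                                  # no extreme residuals remain
        w[extreme] *= shrink                       # assumed rule: halve the weight
    return w, beta

# Synthetic data: a clean linear relationship with one planted outlier.
k = np.arange(60)
x = np.linspace(0.0, 1.0, 60)
y = 2.0 + 3.0 * x + 0.2 * np.sin(13 * k)
y[30] += 10.0
X = np.column_stack([np.ones_like(x), x])

weights, beta = downweight_outliers(X, y)
print("weight on the planted outlier:", weights[30])
print("coefficients after reweighting:", beta)
```

The resulting weights would then be carried into the GWR run, multiplied into the spatial weights or passed through whatever observation-weight option the chosen software provides.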

Poisson and Logistic model forms

Implementations exist in both R and GWR 3.x software
Both require much greater care with respect to collinearity and lack of variation

Mixed-form models

What if some of your variables are stationary and others have variation?

Mixed-form models allow you to hold some coefficients constant while allowing others to vary
Not yet implemented in any statistical package, but not that difficult from a technical standpoint

Concluding comments

What comes next?

Spatial regression
Multilevel models
