cybergis/VNE

Vulnerable Neighborhood Explorer (VNE)

Vulnerable Neighborhood Explorer (VNE) is an open-source visual analytics tool for exploring social vulnerability across different neighborhoods.

VNE is a cyberGIS-based visual analytics tool that allows users to (1) delineate neighborhoods based on their selection of variables describing socioeconomic and demographic profiles and (2) explore which neighborhoods are susceptible to the impacts of disasters based on specific socioeconomic and demographic characteristics. Firefox or Google Chrome is recommended for the best performance.

Getting Started

**You can run VNE in Jupyter Notebook/Lab installed on your PC as well as in CyberGISX. We recommend CyberGISX because all the required packages are already integrated there.**

To use VNE in CyberGISX, follow the steps below:

  1. If you do not have a CyberGISX account, create one with your institution's email address or a Google email address at https://cybergisxhub.cigi.illinois.edu
  2. Once you log in to CyberGISX, go to https://cybergisxhub.cigi.illinois.edu/notebook/vulnerable-neighborhood-explorer, and click the "Open with CyberGISX" button. Wait 3 to 5 seconds. All related source code will be copied automatically, and the main notebook will open. Then click the “play button” at the top to run it.

A video tutorial on using VNE in CyberGISX

Note: The video is currently muted by default. Please unmute the video before playing to hear the audio.

VNE_demo.mp4

Input Parameter Description

Parameter Description
‘title’ Enter a descriptive title for the visualization. The text is placed at the top of the result visualization.
‘subject’ Specify the subject matter (e.g., COVID-19). The text is placed at the top of the maps, column bar charts, and box plots.
‘inputCSV’ Provide the path to your input CSV file. The file should include socioeconomic, demographic, and health status data. Ensure that the first column is labeled geoid and the second column is labeled year. All subsequent columns will be available for selection in the Variables input box below.
‘shapefile’ Enter the path to the shapefile, which is used to draw polygons on the map. The first column header must start with geoid, and its codes should match the geoid column of the CSV files that you enter for inputCSV and disasterInputCSV.
‘disasterInputCSV’ Enter the path to your input CSV file containing data representing the number of disaster-affected people. In the case of COVID-19, the file can contain the number of confirmed cases, COVID-19 testing cases, and deaths. Ensure that the first column is labeled geoid.
‘rate1’ Primarily used for disease data. Computes a percentage from two variables in disasterInputCSV. For example, if the CSV file contains the columns total_count (number of confirmed cases) and total_test (number of individuals tested), you can define Confirmed (%) = total_count / total_test * 100, which computes and displays total_Confirmed (%) in the result visualization. Column names in the file assigned to disasterInputCSV must include an underscore (_): with the definition above, the program identifies the variables ending with _count and divides them by the corresponding variables ending with _test. Similarly, rate2 can be defined as Death (%) = total_deaths / total_cases * 100 to compute and visualize the fatality rate.
‘normalizationCSV’ Enter the path to your input CSV file. The first column should contain column headers from disasterInputCSV, and the second column should contain column headers from inputCSV. For example, entering total_count in the first column and Population in the second column will compute total_count / Population * normalizationUnit.
‘normalizationUnit’ Set the normalization value (e.g., 10,000). For instance, if you set this to 10,000 and use total_count / Population from normalizationCSV, it will compute total_count / Population * 10,000. The result will be visualized as total_count (/10K pop) on the first map, column bar charts, and box plots.
‘years’ List the years for the analysis. This value should match the second column of the input CSV. It will be displayed at the top of the neighborhood map.
‘method’ Clustering algorithm used to identify neighborhood types. Enter one of the following methods: kmeans, ward, affinity_propagation, spectral, gaussian_mixture. For detailed information about each method, refer to the Neighborhood Clustering Methods section in the Analyze Module of the Geospatial Neighborhood Analysis Package (GEOSNAP).
‘nClusters’ Specify the number of clusters.
‘variables’ Select variables to be computed from inputCSV. Below is the full description of eighteen variables used in the example:
  • Median monthly housing costs: Median monthly housing costs.
  • % below poverty: Percentage of the population in poverty.
  • % unemployed: Percentage of the unemployed population.
  • % with 4-year college degree: Percentage of the population with at least a four-year college degree.
  • % manufacturing: Percentage of manufacturing employees (by industries).
  • % service industry: Percentage of service employees (by industries).
  • % structures more than 30 years old: Percentage of structures built more than 30 years ago.
  • % households moved <10 years ago: Percentage of household heads who moved into the unit less than 10 years ago.
  • % multiunit structures: Percentage of housing units in multi-unit structures.
  • % owner-occupied housing: Percentage of owner-occupied housing units.
  • % vacant housing: Percentage of vacant housing units.
  • % > 60 years old: Percentage of the population aged 60 years and over.
  • % < 18 years old: Percentage of the population aged 17 years and under.
  • % white: Percentage of persons of white race, not Hispanic origin.
  • % Asian: Percentage of persons of Asian race (and Pacific Islander).
  • % Hispanic: Percentage of persons of Hispanic origin.
  • % black: Percentage of persons of black race, not Hispanic origin.
  • % foreign: Percentage of the foreign-born population.
‘Distribution_of_Subject’ A1 in Fig1. Enter True to display it or False to hide it.
‘Zscore_Means_across_Clusters’ A in Fig2. Enter True to display it or False to hide it.
‘Zscore_Means_of_Each_Cluster’ B in Fig2. Enter True to display it or False to hide it.
‘Number_of_Column_Charts_for_Subject_Clusters’ The number of column bar charts displayed in Fig1.
‘Number_of_BoxPlots_for_Subject_Clusters’ The number of box plots displayed in Fig1.
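
The arithmetic behind rate1 and normalizationUnit can be sketched in plain Python. This is only an illustration of the formulas described above, not VNE's own code: the function names compute_rates and normalize and the sample numbers are hypothetical, and the suffix pairing is an assumption based on the rate1 description. Only the column names total_count, total_test, and Population come from the parameter descriptions.

```python
def compute_rates(row, label="Confirmed (%)",
                  numerator_suffix="_count", denominator_suffix="_test"):
    """Pair columns by prefix: <prefix>_count / <prefix>_test * 100."""
    rates = {}
    for name, value in row.items():
        if not name.endswith(numerator_suffix):
            continue
        prefix = name[: -len(numerator_suffix)]
        denominator = row.get(prefix + denominator_suffix)
        if denominator:  # skip missing or zero denominators
            rates[f"{prefix}_{label}"] = value / denominator * 100
    return rates


def normalize(count, population, unit=10_000):
    """count / Population * normalizationUnit, e.g. cases per 10K people."""
    return count / population * unit


# Made-up sample row for one geoid (values are illustrative only).
row = {"geoid": "A", "total_count": 50, "total_test": 1000,
       "white_count": 10, "white_test": 400}

rates = compute_rates(row)          # keys: "total_Confirmed (%)", "white_Confirmed (%)"
per_10k = normalize(row["total_count"], population=20_000)
print(rates, per_10k)
```

Pairing by prefix is why the underscore in the column names matters: everything before the suffix identifies which numerator and denominator belong together.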

Example Result Visualization

Example visualizations are available in the folders below:

  • VNE_Chicago
  • VNE_Chicago2
  • VNE_Chicago_kmeans_C5
  • New_York_kmeans_C5
  • Phoenix_kmeans_C5
  • Miami_kmeans_C5
  • US_kmeans_C5
  • Chicago_extended_kmeans_C5
  • Illinois_kmeans_C5

A Case Study

Exploring Neighborhood-level Social Vulnerability to COVID-19 in Chicago.

The two images below (Fig 1 and Fig 2) show the result visualization of VNE, which allows users to explore socioeconomic and demographic disparities in COVID-19 outbreaks, as well as vulnerable neighborhoods and their socioeconomic and demographic characteristics. Two input datasets were used:

  • COVID-19 confirmed and test cases at the zip code level in Chicago, downloaded from the website of the Illinois Department of Public Health. The data cover the COVID-19 outbreak up to July 11th, 2020, the date of download.
  • American Community Survey (ACS) 5-year estimates from 2014 to 2018. From ACS, 18 variables representing different socioeconomic and demographic statuses were collected.

Below is a detailed description of each chart and map that can be visualized using the Vulnerable Neighborhood Explorer (VNE).

  • In Fig1B, users can select disaster-related data via a dropdown menu, choosing between the total infection rate and rates categorized by race/ethnicity (e.g., White, Hispanic) for the Zip Codes outlined in white. Fig1A displays the density distribution of the selected data, providing geographic context.

  • Fig1C presents neighborhood delineations obtained through k-means clustering of the eighteen American Community Survey variables (labeled on the x-axis in Fig 2A) at the Zip Code level. Each neighborhood typically comprises multiple Zip Codes.

  • Fig1D shows the proportion of Zip Codes in each cluster, serving as a legend. Clusters include "White Rich Owner," "White Hispanic Aging Suburban," "Asian Elite Renter," "Hispanic Laborer," and "Black Poor." Users can name each cluster by analyzing the Z-score means of the variables using the heatmap and bar charts in Fig 2A and 2B. The cluster names can be updated in Config.js, which is opened by clicking the third link generated by VNE. For labeling instructions, see [Han et al., 2023].

  • Fig1 E-G display column charts of the total infection rate and the infection rates among White and Hispanic populations within each neighborhood, respectively.

  • Fig1 H-J complement these with box plots showing the distribution of these rates across the five clusters.

  • Fig 2A and 2B illustrate the mean z-scores of each input variable, highlighting the distinguishing characteristics of the clusters in Fig1C. A z-score indicates how far a variable's value is from that variable's mean, measured in standard deviations.

    • Fig 2A compares the mean z-scores of the eighteen variables across the five neighborhoods.
    • Fig 2B shows the sorted mean z-score for a selected neighborhood which can be chosen using the dropdown box at the top of this chart. In this case, five horizontal bar charts representing the mean z-score of each of the eighteen variables can be created on the interface of the interactive map.
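
The mean z-scores plotted in Fig 2A and 2B can be illustrated with a short sketch. It assumes cluster labels are already available (in VNE they come from the chosen clustering method), uses the population standard deviation, and works on made-up values for a single variable; the function names are hypothetical and this is not VNE's own implementation.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values: (value - mean) / standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # constant column: every z-score is zero
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def cluster_zscore_means(values, labels):
    """Average the z-scores of one variable within each cluster."""
    groups = {}
    for score, label in zip(zscores(values), labels):
        groups.setdefault(label, []).append(score)
    return {label: mean(scores) for label, scores in groups.items()}


# Made-up "% below poverty" values for four areas in two clusters.
poverty = [5.0, 7.0, 30.0, 34.0]
labels = ["C0", "C0", "C4", "C4"]

means = cluster_zscore_means(poverty, labels)
print(means)  # negative for C0 (below average), positive for C4 (above average)
```

Repeating this for each of the eighteen variables gives one row of values per cluster, which is the kind of table a heatmap like Fig 2A displays; sorting one cluster's values gives a bar chart like Fig 2B.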

Visualization of VNE - Part 1

Figure 1: The initial part of the Vulnerable Neighborhood Explorer (VNE) visualization.



Visualization of VNE - Part 2

Figure 2: The subsequent part of the Vulnerable Neighborhood Explorer (VNE) visualization.


Interactive Features of the Visual Interface

VNE is a geovisual analytics tool that utilizes a Coordinated Multiple Views (CMV) approach, allowing users to interact with several interconnected visual representations simultaneously. This enhances the analysis of complex data relationships and facilitates the extraction of meaningful insights. Here is a video showcasing the visual features of the Vulnerable Neighborhood Explorer (VNE).

Note: The video is currently muted by default. Please unmute the video before playing to hear the audio.

VNE_Visual_Features.mp4


A key feature of CMV is cross-filtering, where actions in one view—such as selecting a region in a chart—automatically update related views. For instance, consider three visualizations: a map showing COVID-19 infection rates by Zip Code (Fig 3A), a map displaying different neighborhood types (Fig 3B), and a proportion chart of Zip Code areas per neighborhood (Fig 3C). When a user hovers over a neighborhood category in the proportion chart, both maps update to reflect the selected area. Hovering over "C0 White Rich Owner" highlights the data in this neighborhood on both maps, showing a generally low infection rate (Fig 3D, E and F). Conversely, hovering over "C4 Black Poor" highlights that neighborhood, indicating a higher infection rate (Fig 3G, H and I). This interactive cross-filtering enables users to compare infection rates across neighborhoods, focus on specific areas to identify localized trends, and understand how different neighborhoods relate to infection rates. In summary, VNE's CMV and cross-filtering capabilities allow for dynamic exploration and analysis of disaster-related data, such as COVID-19 infection rates, across different neighborhoods.

Figure 3: Partial visualization of the VNE output. Panels (D) and (E) show the map view when the C0: White Rich Owner neighborhood is selected in the proportion chart (F). Panels (G) and (H) show the map view when the C4: Black Poor neighborhood is selected in the proportion chart (I).

Another key feature of VNE is brushing, where highlighting specific data points in one visualization simultaneously highlights the corresponding regions across all linked maps. On the right in Fig 4, the box plot depicts the distribution of COVID-19 infection rates across five neighborhoods, with each dot representing the infection rate of a specific Zip Code area. Users can lasso-select these dots (Zip Codes) on the chart, which highlights the corresponding Zip Code areas on the two maps. In this example, the box plot focuses on the Hispanic Laborer neighborhood, which shows a higher distribution of infection rates than the other neighborhoods. When the dots above the 75th percentile within this neighborhood are selected, the highlighted Zip Code areas represent the top 25% of infection rates in the Hispanic Laborer neighborhood.

Figure 4: The result visualization is created by setting only the box plots to 1 and leaving the other chart options unselected.
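
The brushing selection in this example, picking the areas above the 75th percentile within one neighborhood, amounts to a simple filter. The sketch below is a standalone illustration with made-up rates and placeholder area identifiers; the function name is hypothetical, and the quantile method VNE's box plots use is not stated in this README, so the "inclusive" method here is an assumption.

```python
from statistics import quantiles


def top_quartile_geoids(rates_by_geoid):
    """Return the geoids whose rate lies above the 75th percentile."""
    _, _, q3 = quantiles(list(rates_by_geoid.values()), n=4, method="inclusive")
    return sorted(g for g, rate in rates_by_geoid.items() if rate > q3)


# Made-up infection rates for five areas in one neighborhood type.
rates = {"zip_a": 1.0, "zip_b": 2.0, "zip_c": 3.0, "zip_d": 4.0, "zip_e": 5.0}

selected = top_quartile_geoids(rates)
print(selected)  # these are the areas that would be highlighted on both maps
```

In the interface the same set of identifiers is produced by the lasso selection, and the linked maps use it to decide which polygons to highlight.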

VNE integrates synchronized map views with cross-filtering capabilities to enhance data exploration. When a user pans or zooms one map, all linked maps automatically adjust to display the same geographic area. This synchronization enables simultaneous analysis of COVID-19 infection rates and neighborhood types within focused regions. Additionally, zooming into a sub-region on the maps triggers the distribution chart to dynamically display the infection rate distribution within the current map boundaries. Furthermore, the chart illustrated in Fig 5D updates in real time to show the proportion of polygons within each neighborhood. Fig 5 presents maps of a region that has been zoomed in from Fig 1A-D. Note that the distribution chart (Fig 5A) and the proportion chart (Fig 5D) look different from their counterparts in Fig 1, reflecting the views before and after zooming. Specifically, C2: Asian Elite Renter is not displayed in Fig 5D because the zoomed-in map in Fig 5C no longer includes this neighborhood once zoomed from the view shown in Fig 1C.

Figure 5: Partial visualization of the VNE output. (A) illustrates the distribution of the variable selected in (B), where (B) represents the COVID-19 infection rate. (C) and (D) depict neighborhood types. These images are screen captures taken from the visualization featured in the accompanying video.
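
The way the proportion chart changes with the map extent can be sketched as a recount of the polygons inside the current view. This is a simplified illustration, not VNE's implementation: it assumes each polygon is represented by a single centroid point and that the view is an axis-aligned bounding box, and the coordinates and function name are hypothetical.

```python
from collections import Counter


def cluster_proportions_in_view(polygons, bounds):
    """Share of visible polygons per cluster for bounds = (min_x, min_y, max_x, max_y)."""
    min_x, min_y, max_x, max_y = bounds
    visible = [
        p["cluster"]
        for p in polygons
        if min_x <= p["x"] <= max_x and min_y <= p["y"] <= max_y
    ]
    total = len(visible)
    if total == 0:
        return {}
    return {cluster: n / total for cluster, n in Counter(visible).items()}


# Made-up centroids: the single C2 polygon lies outside the zoomed-in view.
polygons = [
    {"cluster": "C0", "x": 1.0, "y": 1.0},
    {"cluster": "C4", "x": 2.0, "y": 1.5},
    {"cluster": "C4", "x": 2.5, "y": 2.0},
    {"cluster": "C2", "x": 9.0, "y": 9.0},
]

full_view = cluster_proportions_in_view(polygons, (0, 0, 10, 10))
zoomed = cluster_proportions_in_view(polygons, (0, 0, 5, 5))
print(full_view, zoomed)  # C2 appears in the full view but not in the zoomed one
```

A cluster with no polygons in the current extent simply drops out of the result, which matches the behavior described for C2 in Fig 5D.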

Data

Related Resources

Contributors

Su Yeon Han1, Joon-Seok Kim2, Jooyoung Yoo3, Jeon-Young Kang4, Alexander Michels5, Fangzheng Lyu5, Furqan Baig5, Jinwoo Park5, Shaowen Wang5

1 Geography and Environmental Studies, Texas State University, San Marcos, TX, USA
2 Computer Science, Emory University
3 Spatial Sciences Institute, University of Southern California
4 Department of Geography, Kyung Hee University, South Korea
5 CyberGIS Center for Advanced Digital and Spatial Studies, University of Illinois at Urbana-Champaign, Urbana, IL, USA

References

Han, S. Y., Kang, J. Y., Lyu, F., Baig, F., Park, J., Smilovsky, D., & Wang, S. (2023). A cyberGIS approach to exploring neighborhood‐level social vulnerability for disaster risk management. Transactions in GIS, 27(7), 1942-1958.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.


If you have questions, please contact Dr. Su Yeon Han (su.han@txstate.edu), Geography and Environmental Studies, Texas State University.
