Data Collection
About
Summary and Setup
A primary challenge in using digital datasets in the social sciences is limited data literacy in collecting data. This applies both to collecting data from existing repositories and to field-based collection such as observations, interviews, and surveys. This module provides learning materials on both, to build researchers' data collection skills at beginner, intermediate, and expert levels.
Prerequisite
Getting Started
These modules are designed to be hands-on and use Jupyter notebooks. No prior experience is assumed.
Data Sets
Each lesson will provide instructions on where to access and download the relevant datasets.
Software Setup
Discussion
We will use Jupyter notebooks for most of the hands-on activities. A public JupyterHub server has been set up for these lessons, where you can log in with your institutional credentials. Some of the lessons require a desktop GIS tool such as QGIS. You can download the version of QGIS for your operating system from the QGIS website.
Objectives
- Learn to gather geospatial data effectively
Modules
Data from U.S. Census Bureau
Module 1 • 2 hours
There are three broad categories of datasets available from the U.S. Census Bureau:
- Census TIGER/Line Shapefiles
- Decennial Census of Population and Housing
- American Community Survey (ACS)
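For the two tabular categories (the Decennial Census and the ACS), one common access route is the Census Data API. The following is a minimal sketch that only builds a query URL and sends no request. The base URL, dataset paths, year, and variable codes (`B01003_001E`, `P1_001N`) are assumptions chosen for illustration; confirm them against the current API documentation before use. TIGER/Line Shapefiles are downloaded as files rather than queried this way.

```python
from urllib.parse import urlencode

# Assumed base endpoint of the Census Data API; verify against the official docs.
BASE_URL = "https://api.census.gov/data"


def build_census_url(year, dataset, variables, geography, within=None):
    """Build a query URL for one dataset. No network request is made."""
    params = {"get": ",".join(variables), "for": geography}
    if within:
        params["in"] = within
    # Leave ':', '*' and ',' unescaped so the URL stays readable.
    query = urlencode(params, safe=":*,")
    return f"{BASE_URL}/{year}/{dataset}?{query}"


# Illustrative ACS 5-year query: total population for every county in one state.
acs_url = build_census_url(
    2021, "acs/acs5", ["NAME", "B01003_001E"], "county:*", within="state:06"
)

# Illustrative decennial query: total population for every state.
dec_url = build_census_url(2020, "dec/pl", ["NAME", "P1_001N"], "state:*")

print(acs_url)
print(dec_url)
```

Building the URL separately from fetching it makes the request easy to inspect in a notebook cell before any data is downloaded.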
Acquiring Raster Data using Imagery Databases
Module 2 • 30 minutes
Paper maps, air photos, and other hard-copy images that have been scanned at high resolution are often georeferenced to provide useful context for other data layers in a GIS. Details of these raster background layers can also be digitized into separate vector layers for further analysis.
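To make the idea of georeferencing concrete: once a scan is georeferenced, each pixel position maps to a map coordinate, commonly through a six-parameter affine transform. The sketch below assumes a north-up image with 2-metre pixels and a made-up upper-left coordinate; the numbers and the `pixel_to_map` helper are hypothetical and not taken from the lesson.

```python
def pixel_to_map(col, row, transform):
    """Convert a pixel (col, row) to map (x, y) with an affine transform.

    transform = (a, b, c, d, e, f), where
        x = a * col + b * row + c
        y = d * col + e * row + f
    """
    a, b, c, d, e, f = transform
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical north-up raster: 2 m pixels, no rotation, and the
# upper-left reference pixel located at x=500000, y=4100000.
# The y pixel size is negative because row numbers increase downward.
transform = (2.0, 0.0, 500000.0, 0.0, -2.0, 4100000.0)

origin = pixel_to_map(0, 0, transform)
inside = pixel_to_map(100, 50, transform)

print(origin)  # (500000.0, 4100000.0)
print(inside)  # (500200.0, 4099900.0)
```

Desktop GIS tools estimate these parameters from control points, so in practice you rarely enter them by hand.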
Acquiring Vector Datasets from Data Repositories
Module 3 • 30 minutes
An important decision at the start of any geospatial project (from making a basic map to advanced spatial analysis) is whether the project requires original data, collected in the field or digitized from maps, or can use pre-existing data captured by others (e.g., colleagues or online sources). For example, vector data for natural features, political boundaries, transportation networks, etc., already exist somewhere on the web and don’t need to be created from scratch. Many vector datasets are freely available on government or open-source websites, while some specialized datasets are sold as commodities. Open access datasets can vary widely in quality and accuracy depending on the methods used to create them, and in how much preprocessing and filtering is required to make them useful. It is therefore important to explore the data carefully to see whether they will meet your needs.
Field-based Data Collection
Module 4 • 1 hour
- Census data can be used to plan services for specific population groups
- It can be used for site selection for new businesses or service facilities
- It can be used for public policy analysis
- It can be used for spatial analysis of hazard impact, epidemiological models, etc.
QField Training: Mapping Grocery Stores to Identify Potential Food Deserts
Module 5 • 1 hour
Questions
- What does a QField grocery-store mapping workflow look like from start to finish?
- What fields should we collect at grocery stores to support a later food-desert analysis?
- How do we keep field data consistent and QA-ready?
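One way to approach the consistency question is to run every collected record through the same set of checks before analysis. The sketch below is illustrative only: the field names, the allowed store types, and the `check_record` helper are hypothetical and are not the schema used in this module.

```python
# Hypothetical schema for a grocery-store survey form.
REQUIRED_FIELDS = ("store_name", "store_type", "latitude", "longitude")
ALLOWED_TYPES = {"supermarket", "convenience", "farmers_market"}


def check_record(record):
    """Return a list of QA problems found in one collected record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")

    store_type = record.get("store_type")
    if store_type and store_type not in ALLOWED_TYPES:
        problems.append(f"unknown store_type: {store_type}")

    lat, lon = record.get("latitude"), record.get("longitude")
    if isinstance(lat, (int, float)) and not -90 <= lat <= 90:
        problems.append("latitude out of range")
    if isinstance(lon, (int, float)) and not -180 <= lon <= 180:
        problems.append("longitude out of range")
    return problems


good = {"store_name": "Main St Market", "store_type": "supermarket",
        "latitude": 36.1, "longitude": -80.2}
bad = {"store_name": "", "store_type": "Grocery",
       "latitude": 95.0, "longitude": -80.2}

good_problems = check_record(good)
bad_problems = check_record(bad)
print(good_problems)
print(bad_problems)
```

Restricting a field such as store type to a fixed list of values at collection time, rather than free text, prevents most of these problems from occurring in the first place.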
Network Analysis for Grocery Access in QGIS
Module 6 • 1 hour
Questions
- How can we move from collected grocery store points to a road-based accessibility analysis?
- How do shortest path and service area tools in QGIS answer different questions about grocery access?
- Why is network distance often better than straight-line distance for access studies?
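The last question can be illustrated with a toy example outside QGIS. The sketch below compares straight-line distance with shortest-path distance along a small road network, using Dijkstra's algorithm from the Python standard library. The junction names, coordinates, and road segments are made up for illustration; this is not the QGIS implementation.

```python
import heapq
import math

# Hypothetical planar coordinates (in km) for four road junctions.
coords = {"home": (0.0, 0.0), "a": (0.0, 3.0), "b": (4.0, 3.0), "store": (4.0, 0.0)}

# Hypothetical road segments. No road links "home" and "store" directly,
# for example because a river lies between them.
roads = [("home", "a"), ("a", "b"), ("b", "store")]


def straight_line(u, v):
    """Euclidean distance between two junctions."""
    return math.dist(coords[u], coords[v])


def build_graph(edges):
    """Build an undirected adjacency list weighted by segment length."""
    graph = {node: [] for node in coords}
    for u, v in edges:
        length = straight_line(u, v)
        graph[u].append((v, length))
        graph[v].append((u, length))
    return graph


def network_distance(graph, source, target):
    """Shortest-path length along the network (Dijkstra's algorithm)."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, length in graph[node]:
            candidate = dist + length
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf  # target not reachable


graph = build_graph(roads)
euclid_km = straight_line("home", "store")
network_km = network_distance(graph, "home", "store")
print(euclid_km, network_km)  # 4.0 10.0
```

In this toy network the store is 4 km away in a straight line but 10 km away by road, which is the kind of gap that straight-line buffers hide in an access study.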
Instructor
Chimdia Primus Kabuo