3 Course Competencies

3.1 Concept 1: Data Science & Data Acquisition

Guiding Question: What is data science, and how do we acquire data for analysis?

Explain the purpose of data science and describe each stage in a typical data science workflow
Explain the origins of data science, including the rise of big data and advances in computing power, and discuss how these developments have shaped the way knowledge is conceptualized and produced
Identify types of spatial data providers and explain the differences between “old-school”, “new-school”, and volunteer data acquisition and how these differences might affect analytical strategies and results
Determine what each row in a dataset represents (the unit of analysis) and identify the type of each variable
Identify the main characteristics of data quality, bias, and representativeness and explain how these influence whether a dataset is fit for a particular use

Guiding Question: How do we structure and explore data effectively?

Explain the relationship between general programming concepts and tidyverse abstractions for data wrangling
Describe the major components of the tidyverse code design philosophy (human-readable, composable, consistent, tidy) and why these components make the tidyverse suitable for “doing” data science
Identify the major components of “tidy” data
Use basic tidyverse functions (select, filter, mutate, rename) to manipulate a dataset
Create basic visualizations (non-map graphics) of a variable
Generate and interpret descriptive statistics of a variable

Guiding Question: How is spatial data unique?

Describe spatial dependence and spatial heterogeneity and explain why they make spatial data unique
Explain how spatial patterns and processes change across different scales
Describe topology and why spatial relationships matter in spatial analysis
Explain the characteristics of geographic and projected coordinate systems and their importance for spatial analysis
Evaluate the impact of the modifiable areal unit problem (MAUP) and the ecological fallacy on spatial data analysis and the conclusions drawn from aggregated data
Create basic maps of a variable
Calculate basic spatial descriptive statistics

Guiding Question: How do spatial questions guide spatial data science, and how do they shape analytic datasets?

Explain the distinction between spatial feature engineering (using space as a variable) and spatial pattern/process analysis (analyzing spatial structure)
Explain the role of a research question in transforming raw data into an analytic dataset
Describe the main non-spatial and spatial methods we have used to create analytical datasets (group_by(), summarise(), pivot(), left_join(), st_nearest_feature(), st_distance(), st_buffer(), exact_extract(), st_join())
Determine the data transformations required to convert raw spatial data into analytic datasets that can address specific spatial questions

Guiding Question: How do we transform raw spatial data into datasets ready for analysis?

Use tidyverse tools to group, summarize, reshape, and join datasets
Use tidyverse tools to create derived variables and construct aggregations
Perform spatial joins, buffering operations, raster summarization, and distance calculations using sf tools
Apply the data transformations required to convert raw spatial data into analytic datasets that can address specific spatial questions

Guiding Question: How do we visualize spatial patterns in our analytic dataset and communicate them?

Create effective thematic maps using tmap, including layering, symbology, legends, and annotation
Select appropriate visual variables and classification methods and evaluate how those choices affect interpretation

Guiding Question: How does AI change the way we write code and solve problems? How can we use AI tools effectively for coding and analysis?

Explain how AI-assisted programming fits into modern data science workflows
Identify the strengths, limitations, and risks of using AI in spatial data science
Create clear prompts that guide LLM tools toward effective code assistance
Assess the reliability and accuracy of AI-generated code

Guiding Question: How do we quantify spatial relationships?

Explain why different spatial weight and neighborhood definitions are used and how they influence analysis outcomes
Construct spatial weights and spatial weights matrices, justifying the choice of method for a specific analytic context

Guiding Question: How can we detect and interpret spatial patterns in data?

Calculate and interpret global and local spatial autocorrelation metrics (e.g., Moran’s I, LISA)
Conduct point pattern analysis using nearest-neighbor measures and density estimation
Identify and interpret clusters, hot spots, and outliers, and distinguish random from structured spatial patterns
Use bivariate maps, scatterplots, and regression to visualize and interpret spatial relationships between two variables
Use ESDA results to generate or refine hypotheses and research questions