```r
# Lab 3 (section 4.4) datasets
# locations of wind turbines in the continental US
turbines <- st_read("https://drive.google.com/uc?export=download&id=1LLl871Mv3BY7hI56kWPiaMvvk5SxFcQX")
# ACS work-from-home data for NC counties
county_wfh <- st_read("https://drive.google.com/uc?export=download&id=1kV-HKXvlrhfSKOHli4QuRfNoW6I76ihN")
# ACS work-from-home data for NC census tracts
tract_wfh <- st_read("https://drive.google.com/uc?export=download&id=1H-MgCZmca_0YLHg3P6zxqhVrldOTddyD")
# modeled Wet Bulb Globe Temperature raster for Orange County, NC
wbgt_raster <- rast("https://drive.google.com/uc?export=download&id=1l4kZvyCf9ySy7CQlKlzl5y9z6aYHHlpa")
```

4 Labs
4.1 General Formatting Guidance (SAMPLE FILE)
All submitted labs must follow these formatting guidelines (unless otherwise noted). If your work does not follow these guidelines, it will need to be revised and resubmitted:
- Submission:
  - You should submit two files for each lab: a .Rmd and a .html.
  - You should not submit any other files.
  - The files should be saved using the following convention: LASTNAME_Lab_X.Rmd
- Code:
  - All R code must appear inside code chunks. Do not place any code in the body text.
  - All commands MUST have a descriptive comment.
  - Code must follow the syntax that we have used in class.
  - Code chunks should not display messages or warnings in the knitted .html. If messages or warnings are printed, you need to include message = F, warning = F in the chunk header.
  - Each code chunk should perform one logical set of tasks/analyses (e.g. one code chunk could include all of your data manipulation, another could contain all of your data visualizations).
  - Do not print large datasets or unnecessary intermediate outputs.
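For reference, a chunk header that satisfies the message/warning rule looks like the line below. It is shown as an R comment so it can sit in this snippet; the label data-cleaning is just an example, and in the .Rmd the header sits between the chunk's opening backticks.

```r
# chunk header to use in the .Rmd (between the chunk's opening backticks):
# {r data-cleaning, message = F, warning = F}
# knitr accepts F as shorthand for FALSE
```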
- Maps:
  - Maps must use an appropriate visual variable and classification scheme
  - Formatting should not detract from map meaning
  - If using a basemap, transparency must be added
  - The display variable must be clear (e.g. renaming the legend, adding a title)
  - The legend must not cover any map components
- Organization:
  - Code chunks should be logically organized under clear, informative section headers
  - Written responses should appear directly below the code chunk that produces the relevant output
- AI:
  - AI tools may be utilized as a support resource; however, they should not serve as the primary approach to completing work. Students are expected to engage first with course materials, critical thinking, and independent reasoning. Any use of AI must meaningfully contribute to the student’s learning process, and students must be able to justify the role AI played in their work.
  - AI use must be documented using the following structure:
    - A description of the problem(s) you encountered
    - How (or if) the AI use supported solving this problem
    - How (or if) this use of AI will help you approach similar problems independently in the future
  - AI documentation should be included as text at the bottom of the document
4.2 Lab 1
4.2.1 Overview
In this lab, you will examine several spatial datasets to understand their structure. You will identify the variables in each dataset, including their types and geographic components, and determine the unit of analysis. You will also identify where each dataset comes from and how the data was collected. Finally, you will assess basic aspects of data quality and representativeness in order to decide whether each dataset is appropriate for a particular use or research question.
4.2.2 Specifications
| Specification |
|---|
| Student downloaded three appropriate data files from each of the sources in Part 1. |
| Student answered all Part 2 questions for each dataset in complete sentences and demonstrated competent understanding (no substantial errors). |
| Student submitted one Word document with responses and all three data files to Canvas. |
4.2.3 Lab Instructions
Part 1: Downloading Spatial Data
For this lab, you will download three spatial datasets from different data providers. All datasets should be downloaded in a tabular format (either .csv or .xlsx) so they can be opened in Excel, Numbers, or Google Sheets.
iNaturalist (you will need to create an account to download data):
- Use the “Explore” page to filter observations by a specific location, or a specific species and a specific location.
- Once filters are applied, download data from the “Filters” tab
-
- Use the search bar and a query (for instance, “poverty in all counties in north carolina in 2020”) to search for a variable of interest for all North Carolina counties
- Select a table and use the Download option
-
- Browse or search for a dataset of interest (e.g., public facilities, infrastructure, environmental features).
- Open the dataset’s information page and download as .csv.
Part 2: Exploring Data
In this part of the lab, you will examine each dataset you downloaded to understand its structure, variables, and source. You are not expected to perform any analysis. Complete the questions below for each dataset. Your responses should be written in complete sentences.
- Who collected or provided the data?
- How was the data collected?
- Would this data be considered “old-school”, “new-school”, or volunteered data?
- What does one row in the dataset represent?
- What does one column in the dataset represent?
- What is the unit of analysis?
- Identify at least one numeric variable
- Identify at least one categorical variable
- Identify any fields that describe location or geography
- Are there missing values or incomplete fields?
- Identify one limitation or potential source of bias in the data
- Describe one research question this dataset could help answer
- Describe one research question that this dataset would not be useful for because of the unit of analysis or the representation decisions
4.3 Lab 2
4.3.1 Overview
In this lab, you will work with a population projection file for North Carolina provided by the North Carolina Office of State Budget and Management. You will read in a file, practice some basic data manipulation using base R and tidyverse, and make a graph.
4.3.2 Specifications
| Specification |
|---|
| Lab is submitted on Canvas and follows the general formatting guidelines. Minor formatting issues don’t prevent the work from being read and understood. |
| Reading Data: All required datasets are read into R. Responses correctly identify basic properties of the data. Minor inaccuracies are acceptable if overall understanding is clear. |
| Manipulating Data: Code produces the required variables and transformed datasets. Minor coding errors or inefficiencies are acceptable if the results are usable. |
| Describing Data: Required tables and non-map graphics are present. Written responses correctly describe patterns or distributions shown in the outputs. Response is focused on extracting meaning, not simply summarizing descriptive statistics results. |
4.3.3 Lab Instructions
Create a new .Rmd document (and save it to your GEOG215 folder).
Then complete the following tasks. Your .Rmd file should be organized so that each task has a text header, a code chunk, comments for each command, and any written components directly under the chunk. Remember to follow the Formatting Guide.
Reading Data
- Load the tidyverse, gt, and e1071 libraries and read in the data using the following command:

```r
nc_pop <- read_csv("https://drive.google.com/uc?export=download&id=1ogC0lRjEMaXLmRrZkwLVI9MbmJdk4NIy")
```

- Answer the following questions:
  - What does each row in the dataset represent?
  - What does each column in the dataset represent?
  - What is the data type of each column? (either use the class() command or explore the Environment tab)
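As a quick illustration of checking column types, here is a toy data frame (not the lab data; the column names are invented):

```r
# toy data frame standing in for the real nc_pop object
toy_pop <- data.frame(
  county  = c("Alamance", "Orange"),  # character column
  pop2020 = c(100000, 150000)         # numeric column
)

# class() reports the type of a single column
class(toy_pop$county)    # "character"

# sapply() applies class() to every column at once
sapply(toy_pop, class)
```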
Manipulating Data
- Use base R to create an object called high_pop_counties that contains only counties with a population over 100,000 in 2020
- Use tidyverse to create an object called low_pop_counties that contains only counties with a population below 20,000 in 2020
- Use base R to add a variable called change to the nc_pop object that is the population difference between 2010 and 2050
- Use tidyverse to add a variable called growth to the nc_pop object that assigns a value of “Growing” to counties that are projected to gain population between 2010 and 2050 and a value of “Shrinking” to counties that are projected to lose population over the same period
- Write a command in base R that calculates the mean of the change variable
- Write a command in tidyverse that calculates the max of the change variable
- Write a tidyverse command that creates a new object called simp_pop that selects just the change variable and renames it to pop_change
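The base R operations above can be sketched on a toy data frame. The column names pop2010/pop2020/pop2050 are hypothetical (check the real file's names), and the comments name the tidyverse verbs you would use for the tidyverse versions:

```r
# toy stand-in for nc_pop; column names are invented for illustration
toy <- data.frame(
  county  = c("A", "B", "C"),
  pop2010 = c(50000, 150000, 10000),
  pop2020 = c(60000, 160000, 9000),
  pop2050 = c(80000, 140000, 7000)
)

# base R row subset (tidyverse equivalent: filter(toy, pop2020 > 100000))
high_pop <- toy[toy$pop2020 > 100000, ]

# base R new column (tidyverse: mutate(change = pop2050 - pop2010))
toy$change <- toy$pop2050 - toy$pop2010

# base R categorical recode (tidyverse: mutate(growth = if_else(change > 0, ...)))
toy$growth <- ifelse(toy$change > 0, "Growing", "Shrinking")

# base R summaries (tidyverse: summarize(mean(change)), summarize(max(change)))
mean(toy$change)
max(toy$change)

# keep and rename one column (tidyverse: select(pop_change = change))
simp <- data.frame(pop_change = toy$change)
```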
Describing Data
- Create a descriptive statistics table of the change variable
- Create two graphics using ggplot that help describe the data
- Answer the following questions:
  - Describe what the descriptive statistics table tells you about the distribution of the variable
  - Describe what the graphics tell you about the variable, focusing on what you couldn’t already learn from the descriptive statistics table
4.4 Lab 3
4.4.1 Overview
In this lab, you will practice working with spatial data in R by examining spatial and non-spatial attributes, mapping variables, and calculating descriptive and spatial descriptive statistics.
You will work with several spatial datasets:
- American Community Survey data on work from home for North Carolina counties and census tracts (vector data)
- Modeled Wet Bulb Globe Temperature on July 17, 2025 across Orange County, NC provided by Andrew Robinson (Geography PhD student) (raster data). WBGT is a measure of heat stress.
- Locations of wind turbines in the continental US from The U.S. Wind Turbine Database
4.4.2 Specifications
| Specification |
|---|
| Lab is submitted on Canvas and follows general formatting guidelines. Minor formatting issues may be present but do not interfere with readability or interpretation. |
| Reading Data: All required datasets are read into R. Responses correctly identify basic properties of the data. Minor inaccuracies are acceptable if overall understanding is clear. |
| Manipulating Data: Code produces the required variables and transformed datasets. Minor coding errors or inefficiencies are acceptable if the results are usable. |
| Describing Data: Required maps, tables, and non-map graphics are present. Written responses correctly describe patterns or distributions shown in the outputs. Response is focused on extracting meaning, not simply summarizing descriptive statistics results. |
4.4.3 Lab Instructions
Create a new .Rmd document (and save it to your GEOG215 folder).
Then complete the following tasks. Your .Rmd file should be organized so that each task has a text header, a code chunk, comments for each command, and any written components directly under the chunk. Remember to follow the Formatting Guide.
Reading Data
- Load the tidyverse, tmap, spdep, gt, terra, sfdep, and sf libraries
- Read in the data using the following commands (the four st_read()/rast() commands shown at the start of this chapter):
Answer the following questions:
- What is the CRS of each dataset? Is it a geographic coordinate system or a projected coordinate system?
- What is the geometry type of turbines, county_wfh, and tract_wfh?
- What is the resolution of the WBGT data?
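A sketch of commands that help answer these questions, assuming the libraries listed above are loaded and the four objects have been read in (not run here, since it depends on the remote lab data):

```r
# CRS report for a vector layer; the units (degrees vs. meters/feet)
# help distinguish geographic from projected systems
st_crs(county_wfh)

# geometry type summarized across all features of an object
st_geometry_type(turbines, by_geometry = FALSE)

# cell size of the raster, in the units of its CRS
res(wbgt_raster)
```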
Manipulating Data
- Use a tidyverse command to calculate a new field perc_wfh in the county_wfh and tract_wfh objects (wfeE is the number of people working from home and totalE is the total population)
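A minimal sketch of that mutate() call, assuming wfeE and totalE are numeric columns as described (multiply by 100 for a 0–100 percent rather than a proportion):

```r
# percent working from home in each county
county_wfh <- county_wfh |>
  mutate(perc_wfh = 100 * wfeE / totalE)

# repeat for the tract-level object
tract_wfh <- tract_wfh |>
  mutate(perc_wfh = 100 * wfeE / totalE)
```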
Describing Work From Home
- Create a map of perc_wfh for both the county_wfh and tract_wfh objects
- Create a descriptive statistics table of perc_wfh for both the county_wfh and tract_wfh objects
  - To make a descriptive statistics table when you are using spatial data, you must add the command st_drop_geometry() before creating the table (i.e. DATA |> st_drop_geometry() |> select(VARIABLE))
- Create one non-map graphic of perc_wfh for both the county_wfh and tract_wfh objects
- Answer the following questions (remember that the underlying data for the ACS is collected at the individual/household level; the only difference is the level of spatial aggregation):
  - How does the distribution of perc_wfh differ between the county-level and tract-level data? Include discussion of central tendency, spread, shape, and frequency.
  - How does changing the scale of aggregation (counties vs. tracts) affect the spatial pattern you observe?
  - What does this example demonstrate about the scale effect of MAUP and why it matters for interpreting spatial data?
Describing Heat Stress in Orange County NC
- Create a map of wbgt_raster
- Summarize the cell values of wbgt_raster
Describing Wind Turbines
- Calculate the mean center of wind turbines, the standard deviational ellipse of wind turbines, and the weighted mean center based on a variable of interest selected from the dataset
  - To calculate the weighted mean center, you may need to drop NA values from your variable of interest. To do this, create a new object turbine_dropped <- turbines |> drop_na(VARIABLEOFINTEREST) and use that object to calculate the weighted mean center.
- Create a map that symbolizes the mean center of wind turbines, the standard deviational ellipse of wind turbines, and the weighted mean center (make sure that you add manual legend entries for the mean center and weighted mean center)
- Answer the following questions:
  - What does your map reveal about the spatial distribution of wind turbines?
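Conceptually, the mean center and weighted mean center are just (weighted) averages of the point coordinates. A toy base-R example (in the sf workflow you would first extract coordinates with st_coordinates()):

```r
# toy coordinates and weights standing in for turbine locations
x <- c(0, 2, 4)
y <- c(0, 0, 6)
w <- c(1, 1, 2)   # e.g., a capacity variable used as the weight

# mean center: simple average of the x and y coordinates
mean_center <- c(mean(x), mean(y))                         # (2, 2)

# weighted mean center: pulled toward high-weight points
wt_center <- c(weighted.mean(x, w), weighted.mean(y, w))   # (2.5, 3)
```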
4.5 Lab 4
4.5.1 Overview
In this lab, you will practice formulating spatial research questions and finding, downloading, and loading spatial data that is relevant to those questions.
4.5.2 Specifications
| Specification |
|---|
| Lab is submitted on Canvas and follows the general formatting guidelines. Minor formatting issues don’t prevent the work from being read and understood. YOU MUST ALSO SUBMIT YOUR DATA FILES |
| RQ1: Student successfully reads in data using relative file paths and creates a basic map of the datasets. Written interpretation demonstrates competent understanding of the datasets. |
| RQ2: Student successfully reads in data using relative file paths and creates a basic map of the datasets. Written interpretation demonstrates competent understanding of the datasets. |
4.5.3 Lab Instructions
Create a new .Rmd document (and save it to your GEOG215 folder).
Then complete the following tasks. Your .Rmd file should be organized so that each task has a text header, a code chunk, comments for each command, and any written components directly under the chunk. Remember to follow the Formatting Guide.
RQ1
- Formulate a research question that would involve spatial feature engineering (using space to create new variables in a dataset that can be used for spatial or non-spatial modeling)
- Find and download two spatial datasets that would help you answer that research question. You should be looking for spatial files (.shp, .geojson, .tif) or a tabular file (.csv) that has coordinates in it.
- Read your spatial files into R using relative file paths.
- Make a map of each of your datasets. If you found a .csv, you will need to use the st_as_sf() command to make the data spatial.
- Answer the following questions for each of your datasets:
  - What is the unit of analysis?
  - What attributes does the data have that would contribute to answering your research question?
  - What types of transformations would be required to be able to combine your datasets into a “tidy” object?
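If one of your downloads is a .csv with coordinate columns, the conversion looks like this sketch. The file path, column names lon/lat, and the WGS84 EPSG code are assumptions; match them to your actual file:

```r
library(sf)
library(readr)

# read the tabular file, then promote it to an sf object
pts <- read_csv("data/my_points.csv") |>   # hypothetical relative path
  st_as_sf(coords = c("lon", "lat"),       # x column first, then y
           crs = 4326)                     # WGS84; use your data's actual CRS
```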
RQ2
- Formulate a research question that would involve spatial feature engineering (using space to create new variables in a dataset that can be used for spatial or non-spatial modeling)
- Find and download two spatial datasets that would help you answer that research question. You should be looking for spatial files (.shp, .geojson, .tif) or a tabular file (.csv) that has coordinates in it.
- Read your spatial files into R using relative file paths.
- Make a map of each of your datasets. If you found a .csv, you will need to use the st_as_sf() command to make the data spatial.
- Answer the following questions for each of your datasets:
  - What is the unit of analysis?
  - What attributes does the data have that would contribute to answering your research question?
  - What types of transformations would be required to be able to combine your datasets into a “tidy” object?
4.6 Lab 5
4.6.1 Overview
In this lab, you will practice creating analytic datasets by using spatial feature engineering.
4.6.2 Specifications
| Specification |
|---|
| Lab is submitted on Canvas and follows the general formatting guidelines. Minor formatting issues don’t prevent the work from being read and understood. |
| The required transformations are performed to create the analytic variable for each research question. The overall workflow reflects the intended steps, even if minor errors are present |
| Each analytical dataset includes the required visualization (scatterplot and map). Written responses accurately describe what the analytic variable represents and what the distribution shows. |
4.6.3 Lab Instructions
Create a new .Rmd document (and save it to your GEOG215 folder).
Then complete the following tasks. Your .Rmd file should be organized so that each task has a text header, a code chunk, comments for each command, and any written components directly under the chunk. Remember to follow the Formatting Guide.
Reading Data
- Load the tidyverse, tmap, terra, exactextractr, and sf libraries
- Read in the data using the following commands:
```r
# all EMS stations across the Triangle region
triangle_ems <- st_read("https://drive.google.com/uc?export=download&id=1JYZxoM3GB43AnSY2mKW1p_ixeBBh1CBJ")
# all census blocks across the Triangle region
triangle_blocks <- st_read("https://drive.google.com/uc?export=download&id=1Q6f9wPXrN0NMZJZkAR9OFkneCfxIBiOl")
# all census blocks across Chapel Hill
ch_blocks <- st_read("https://drive.google.com/uc?export=download&id=14Jhu9ZQDRL14lQGiTUKcqqrsQcooH6BU")
# raster of average summer temperature in Chapel Hill (in Celsius)
ch_summer_heat <- rast("https://drive.google.com/uc?export=download&id=1qvJepSdhFTiKZIAyn76VA9xvlIM8f5s8")
# raster of canopy cover in Chapel Hill (% per pixel)
ch_canopy_cover <- rast("https://drive.google.com/uc?export=download&id=1a5MibyiyAJgxIqTxzSBOoMNqklErlIVR")
```
Analytic Dataset #1
For this analytic dataset you should create a new object that adds two columns to the ch_blocks object. One column, av_heat, should be the average summer heat across each block. The second column, av_canopy, should be the average tree canopy across each block.
After creating the analytic dataset, create a scatterplot (dataset |> ggplot(aes(x = VARIABLE1, y = VARIABLE2)) + geom_point()) that shows the relationship between summer heat and canopy coverage across Chapel Hill.
Below the code chunk, write a brief interpretation of the relationship shown in the scatterplot.
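One way to sketch the zonal-summary step with exactextractr (which is loaded above; it averages raster cells within each polygon, weighting partially covered cells). Not run here, since it depends on the remote lab data:

```r
library(exactextractr)

# average raster value within each block; "mean" is a built-in summary
ch_blocks$av_heat   <- exact_extract(ch_summer_heat,  ch_blocks, "mean")
ch_blocks$av_canopy <- exact_extract(ch_canopy_cover, ch_blocks, "mean")
```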
Analytic Dataset #2
For this analytic dataset you should create a new object that adds a column to the triangle_blocks object. The new column (count_ems) should contain the number of EMS stations within 5 miles of each block.
After creating the analytic dataset, create a basic map that displays the count_ems variable across the triangle.
Below the code chunk, write a brief interpretation of the spatial pattern of the map.
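A sketch of one common approach: buffer each block and count the stations that intersect the buffer. The 26,400 value assumes both layers share a projected CRS measured in feet (5 miles = 26,400 ft); adjust the distance to your CRS units:

```r
library(sf)

# buffer each block by 5 miles (assuming a feet-based projected CRS)
block_buffers <- st_buffer(triangle_blocks, dist = 26400)

# st_intersects() returns, per buffer, the indices of stations inside it;
# lengths() turns that list into a per-block count
triangle_blocks$count_ems <- lengths(st_intersects(block_buffers, triangle_ems))
```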
4.7 Lab 6
4.7.1 Overview
In this lab, you will create well-designed maps for the analytic datasets created in the Analytic Dataset Practicum. Remember to use this valuable resource:
4.7.2 Specifications
| Specification |
|---|
| Lab is submitted on Canvas and follows the general formatting guidelines. Minor formatting issues don’t prevent the work from being read and understood. |
| Maps follow the core design instructions. Minor design or formatting issues are acceptable |
| All maps use ONLY tmap v4 code |
4.7.3 Lab Instructions
Map 1: Warm Summer Days for North Carolina Counties
This map should adhere to the following design guidelines:
- County values for percent warm summer days should be represented by dots (hint: you can use the tm_dots() command even with polygons)
- The dots should be visualized using an appropriate color palette and classification scheme
- The map should include a basemap and a legend within the map frame
- The map title should be centered above the map
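A rough, untested sketch of tmap v4 syntax for a map like this; nc_counties and perc_warm are placeholder names for your Lab 6 data, and you should check the tmap v4 documentation for the exact scale and legend options:

```r
library(tmap)

tm_basemap("OpenStreetMap") +
  tm_shape(nc_counties) +
  tm_dots(
    fill = "perc_warm",                                 # display variable
    fill.scale  = tm_scale_intervals(style = "jenks"),  # classification scheme
    fill.legend = tm_legend(title = "% warm summer days")
  ) +
  tm_title("Warm Summer Days by County")
```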
Map 2: Bus Stop Canopy Coverage
This map should adhere to the following design guidelines:
- Bus stop points (NOT buffers) should be visualized. You should use the st_drop_geometry() command and a left_join() to join your buffered values back to the original bus stop object
- Bus stops should be visualized using an appropriate color palette and classification scheme
- The map should include a satellite basemap and an appropriately placed legend
- The map title should be centered above the map
Map 3: Distance to Bus Stops in Chapel Hill
This map should adhere to the following design guidelines:
- Only addresses above a user-selected distance should be visualized (i.e. more than 0.25 miles from a bus stop)
- Chapel Hill boundaries should be included (ch_boundaries <- st_read("https://drive.google.com/uc?export=download&id=1ievfdMpmrZBO1qBI_uILYpb1XbVXmC0b"))
- Bus stop locations should be included
- The map should include an unlabeled basemap and an appropriately placed legend
- The map title should be within the map frame
4.8 Lab 7
4.8.1 Overview
In this lab, you will practice generating effective prompts for AI assistance (an LLM) with a coding task.
You will work with several spatial datasets:
- Non-motorist crashes in Carrboro (crashes involving a pedestrian or bicycle)
- Carrboro block group boundaries (vector) and ACS data (tabular data)
- Carrboro roadways
4.8.2 Specifications
| Specification |
|---|
| Lab is submitted on Canvas and follows general formatting guidelines. Minor formatting issues may be present but do not interfere with readability or interpretation. |
| Reading Data: Data is downloaded and read in using relative file paths |
| Task 1: The initial prompt clearly follows the COSF framework. Any necessary revisions to the prompt or generated code have been made in order to successfully complete the task. The written responses demonstrate thoughtful engagement with the process of using an LLM for coding assistance. |
| Task 2: The initial prompt clearly follows the COSF framework. Any necessary revisions to the prompt or generated code have been made in order to successfully complete the task. The written responses demonstrate thoughtful engagement with the process of using an LLM for coding assistance. |
4.8.3 Lab Instructions
Create a new .Rmd document (and save it to your GEOG215 folder). Download the data and read in each dataset using relative file paths.
Then complete the Prompting Practice and create finalized code to complete each task. For each of the two tasks, your .Rmd should be structured as follows:
- Header for the task
- The initial text prompt you provided to an LLM (as text)
- The initial code output provided by the LLM. You should include this in a code chunk, but set eval = F. This will prevent the code from running in the final .html.
- A code chunk with the final code you used to complete the task and a map of the output
- Text below the chunk that answers these questions:
- How effective was the initial prompt in generating useful code?
- What issues or errors appeared in the initial code output?
- What modifications did you make to the prompt or the code in order to successfully complete the task? Did you use the LLM to fix these issues or did you fix them independently?
- In what ways did the LLM speed up the process or make the task easier?
- What challenges or limitations did you encounter when using the LLM?
4.9 Lab 8
4.9.1 Overview
In the Autocorrelation tutorial, we learned how to define spatial neighbors and how to use these neighborhoods to assess whether values at locations are related to values at near locations.
In this lab, you will practice applying these skills to the NC schools dataset from the autocorrelation tutorial and census tract data for the Triangle region of NC. Your goal is to define an appropriate neighborhood definition and explore global and local autocorrelation for the two following datasets:
- Percent without health insurance by census tract in the Triangle region of North Carolina
- Percent of students with chronic absenteeism in North Carolina schools before and after Covid
4.9.2 Specifications
This lab is designed to assess the Concept 8 Competencies. You’ll be evaluated on the following specifications:
| Specification |
|---|
| HTML and RMD versions of the R Markdown file have been submitted. |
| The RMD is clearly organized (appropriate headings, code chunk formatting, and clean output) so that the analysis is easy to read and understand. |
| Defining Neighborhoods: Student defines neighborhoods and creates spatial weight matrices for both NC schools and Triangle tracts. Written justification explains why the chosen neighborhood definition is appropriate for this analysis. |
| Analyzing Global Autocorrelation: Student computes Moran’s I for school chronic absence percentages (before and after covid) and tract level percent population without health insurance. Written interpretation explains what the Moran’s I values indicate about spatial structure and strength of autocorrelation. |
| Analyzing Local Autocorrelation: Student computes and maps local indicators of spatial autocorrelation for both datasets. Written interpretation describes the observed clustering patterns and proposes one reasonable hypothesis for why these patterns exist. |
4.9.3 Lab Instructions
Create a new .Rmd named “LASTNAME_lab8.Rmd”. Save it into your GEOG215 folder.
Remove sample text (leaving the header and set-up chunk). Add a chunk for loading libraries and reading in data. Add the following code into that chunk:
```r
# load libraries
library(sf)
library(tidyverse)
library(spatstat)
library(spdep)

# read in data
schools_sf <- st_read("https://drive.google.com/uc?export=download&id=1_X3s6sTw5zeIuXa6wDQ61r8Px7_tI7Nr")
acs_tract_nc <- st_read("https://drive.google.com/uc?export=download&id=1cnz4xgdDRZvlXzN3IyvCxOETRWfrpw-0") |>
  filter(COUNTYFP %in% c("135", "183", "063")) |>
  filter(!is.na(pct_no_health_insur))
```

Under a header called “Defining Neighbors” add a code chunk, and a written analysis below the code chunk that does the following:
- Defines neighborhoods and creates a neighborhood weight matrix for NC schools and NC tracts. The goal is to explore global and local autocorrelation in percent of population with no health insurance (tracts) and chronic absences at NC schools before and after Covid. You can use any appropriate neighborhood definition
- Justify your neighborhood definition. Why is it useful for this particular analysis?
Under a header called “Analyzing Global Autocorrelation” add a code chunk, and a written analysis below the code chunk that does the following:
- Computes Moran’s I for school chronic absences dataset (for both before and after covid) and NC tract values for no health insurance using your selected neighborhood definition.
- Describe your results. What does the Moran’s I value tell you about the structure of the data (and the strength of this structure)? Also, compare the spatial structure of chronic absences before and after covid.
Under a header called “Analyzing Local Autocorrelation” add a code chunk, and a written analysis below the code chunk that does the following:
- Calculate and map Local Indicators of Autocorrelation for all datasets
- Describe your results. What is the spatial pattern of clustering? Propose one reasonable hypothesis for why this pattern might exist. Also, compare the local spatial structure of chronic absences before and after covid.
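A condensed sketch of the spdep workflow for the tract data. Queen contiguity is one valid neighborhood choice here, not the required one (for the school points you would instead use a distance- or k-nearest-neighbor definition). Not run here, since it depends on the remote lab data:

```r
library(spdep)

# queen contiguity neighbors from tract polygons
nb <- poly2nb(acs_tract_nc, queen = TRUE)

# row-standardized weights; zero.policy guards against no-neighbor tracts
lw <- nb2listw(nb, style = "W", zero.policy = TRUE)

# global Moran's I for percent without health insurance
moran.test(acs_tract_nc$pct_no_health_insur, lw, zero.policy = TRUE)

# local indicators of spatial association (one row of statistics per tract)
lisa <- localmoran(acs_tract_nc$pct_no_health_insur, lw, zero.policy = TRUE)
```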
4.10 Lab 9
4.10.1 Overview
In the Bivariate Relationships tutorial, we learned how to explore relationships with two variables. We also learned that spatial autocorrelation can appear in model residuals when nearby locations share similar values for reasons not fully captured by the observed covariates, so we learned how to use spatial regression models to help account for residual autocorrelation.
In the Point Pattern Analysis tutorial, we learned how to interpret the spatial structure of unmarked point data.
In this lab, you will practice applying these skills to a dataset representing wildfire locations in Croatan National Forest in North Carolina from 1957-2024 (from US Forest Service) and data from the American Community Survey for North Carolina census tracts.
4.10.2 Specifications
This lab is designed to assess the Concept 9 Competencies. You’ll be evaluated on the following specifications:
| Specification |
|---|
| HTML and RMD versions of the R Markdown file have been submitted. |
| The RMD is clearly organized (appropriate headings, code chunk formatting, and clean output) so that the analysis is easy to read and understand |
| OLS Model: Student fits the specified OLS model and interprets the coefficient estimates in context |
| Residual Spatial Dependence: Student defines a neighborhood structure, creates a spatial weights matrix, and tests OLS residuals for spatial autocorrelation. Written interpretation explains whether residual spatial structure is present |
| Traditional Spatial Model: Student fits an appropriate spatial regression model and explains why it was chosen. Written interpretation compares the spatial model to the OLS model |
| Analyzing Global Structure: Student maps the point pattern, computes and visualizes a quadrat count (with Monte Carlo simulations) and tests it against CSR, and creates a kernel density estimate for both datasets. |
| Analyzing Local Structure: Student computes and visualizes ANN and L-function (both with Monte Carlo simulations). Student appropriately determines whether to test against CSR or an Inhomogeneous Poisson Pattern. |
| Describing Results: Student integrates global and local results to identify the dominant spatial process (first-order, second-order, or both) and proposes a specific hypothesis informed by background research. |
4.10.3 Lab Instructions
Create a new .Rmd named “LASTNAME_lab9.Rmd”. Save it into your GEOG215 folder.
Remove sample text (leaving the header and set-up chunk). Add a chunk for loading libraries and reading in data. Add the following code into that chunk:
```r
# load libraries
library(sf)
library(tidyverse)
library(tmap)
library(spdep)
library(spatialreg)
library(spatstat)

# read in data
acs_tract_nc <- st_read("https://drive.google.com/uc?export=download&id=1cnz4xgdDRZvlXzN3IyvCxOETRWfrpw-0")
wildfire_points <- st_read("https://drive.google.com/uc?export=download&id=16m9HJAn5uzSyiUdhdhoboPRRi9dDDQaT") |>
  st_transform(crs = 2264)

### THIS TURNS THE DATA INTO A POINT PATTERN OBJECT WITH A DEFINED BOUNDARY ###
### USE THIS FOR PPA ###
combined_wildfire <- st_union(wildfire_points)
bb_wildfire <- st_convex_hull(combined_wildfire) |> as.owin()
wildfire.ppp <- as.ppp(st_coordinates(wildfire_points), W = bb_wildfire)
```

Under a header called “OLS” add a code chunk, and a written analysis below the code chunk that does the following:
- Fits an OLS model with a dependent variable of median_hh_inc and a single predictor variable that you choose from acs_tract_nc
- Describe the results of the OLS model. Interpret the relationship between median household income and the predictor variable.
Under a header called “Residual Spatial Dependence” add a code chunk, and a written analysis below the code chunk that does the following:
- Test the OLS residuals for spatial dependence
- Interpret the Moran’s I statistic for the OLS residuals and explain what the result suggests about the presence and strength of spatial autocorrelation.
Under a header called “Spatial Model” add a code chunk, and a written analysis below the code chunk that does the following:
- Determine whether your analysis of the residuals indicates that you should use a spatial model. If so:
  - Determine whether an SEM or SLM is better suited for this dataset and run that model.
  - Interpret the results of the spatial regression model, including the estimated spatial parameter.
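A sketch of the OLS → residual test → spatial model sequence. The variable pred is a placeholder for your chosen predictor, and lw is a spatial weights list (listw) you would build with spdep as in Lab 8. Not run here, since it depends on the remote lab data:

```r
library(spdep)
library(spatialreg)

# OLS with a placeholder predictor
ols <- lm(median_hh_inc ~ pred, data = acs_tract_nc)

# Moran test on the OLS residuals (lw is your spatial weights list)
lm.morantest(ols, lw)

# if residuals are autocorrelated, fit a spatial lag model (SLM)...
slm <- lagsarlm(median_hh_inc ~ pred, data = acs_tract_nc, listw = lw)

# ...or a spatial error model (SEM); the Lagrange multiplier diagnostics
# can help you choose between them
sem <- errorsarlm(median_hh_inc ~ pred, data = acs_tract_nc, listw = lw)
```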
Under a header called “Analyzing Wildfire Points” add a code chunk, and a written analysis below the code chunk that does the following:
- Make a map (with basemap) of the wildfire_points object
- Test the quadrat count against CSR using a quadrat density test (use nx = 8, ny = 8). Add Monte Carlo simulations (n = 99) to the statistical testing
- Create a KDE for wildfire counts
- Run an ANN analysis that uses Monte Carlo simulations (n = 99) to create an empirical probability. You will need to determine if your simulations should be homogeneous or inhomogeneous based on your global analysis.
- Evaluate the L-function and envelope (n = 99). You will need to determine if you should use the homogeneous or inhomogeneous function (and simulations) based on your global analysis. Because our border is not a rectangle, we can’t use the “iso” correction, so use the “border” correction instead.
- In text below the chunk, identify whether there is evidence for the pattern of wildfires reflecting first-order processes, second-order processes, or a combination of both by clearly interpreting your analytical results. Based on brief background research on wildfires and the study area, propose one specific, plausible hypothesis about an underlying process that could be contributing to the spatial pattern.
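A sketch of the core spatstat calls for the global and local analyses, using the wildfire.ppp object built above (swap Lest for Linhom if your global analysis points to an inhomogeneous process). Not run here, since it depends on the remote lab data:

```r
library(spatstat)

# quadrat test against CSR with a Monte Carlo p-value
qt <- quadrat.test(wildfire.ppp, nx = 8, ny = 8,
                   method = "MonteCarlo", nsim = 99)

# kernel density estimate of fire intensity
plot(density(wildfire.ppp))

# nearest-neighbor distances (the basis for an ANN analysis)
mean(nndist(wildfire.ppp))

# L-function envelope with border correction (99 CSR simulations)
env <- envelope(wildfire.ppp, Lest, nsim = 99, correction = "border")
plot(env)
```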