Assignment 1
In this assignment, you will read in a file, practice some basic data manipulation, and make a graph. You will do all of your work in an R Markdown document. The data are North Carolina county-level population projections provided by the North Carolina Office of State Budget and Management (https://www.osbm.nc.gov/facts-figures/population-demographics/state-demographer/countystate-population-projections/population-overview). Make sure each line of code has a descriptive comment.
Download the Data Here
Create a new R Markdown document
- If RStudio is open, close it without saving your environment. Then open RStudio, make sure that your environment is empty, and choose File | New File | R Markdown. Note: to clear your environment, click the button that looks like a broom in the Environment tab.
- In the popup window, enter “Assignment #1” for the title and your name for the author. Make sure that the type (on the left) is Document and the Default Output Format is HTML.
- The new R Markdown document should have a “header” section populated with the information you entered. For example, mine looks like this:
---
title: "Assignment #1"
author: "Julia Cardwell"
date: "5/1/2023"
output: html_document
---
- Save your .Rmd file using the following naming convention:
- Lastname_GEOG215_Assg1.Rmd
- For example, Julia’s file would be: Cardwell_GEOG215_Assg1.Rmd
Read in and Summarize the Data
- Add a code chunk at the top of your .Rmd. Load the tidyverse library.
- Add a third-level heading to your R Markdown file named “Read in Data”. Below this heading, add a code chunk. In the chunk, use a relative file path to read in (using the `read_csv()` command) the file named “nc_population_projections.csv” and assign it to an object called `nc_pop`.
- Under the code chunk you just created, insert a third-level heading to your R Markdown file named “Data Preparation and Summary”. Create an object called `nc_pop_simplified` that only contains the following variables: `pop_2010`, `pop_2020`, `pop_2030`, `pop_2040`, `pop_2050`.
- Using in-line code, write a sentence (in plain text underneath the code chunk) that describes the number of rows and the number of columns in `nc_pop_simplified`. You must use a command to complete this task, not simply write the correct answer.
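The steps in this section can be sketched on a small made-up table. This is only an illustration: the object names (`toy`, `toy_pop`, `toy_pop_simplified`), the extra `note` column, and all values are hypothetical, and the file is written to a temporary location so the example runs on its own. In your assignment, the path should be relative to your .Rmd instead.

```r
# Load the tidyverse, which provides tibble(), write_csv(), read_csv(), and select()
library(tidyverse)

# Hypothetical toy data standing in for the real file; all values are invented
toy <- tibble(
  County   = c("A", "B", "C"),
  pop_2010 = c(100, 200, 300),
  pop_2020 = c(110, 210, 330),
  note     = c("x", "y", "z")
)

# Write the toy data to a temporary CSV so the example is self-contained
toy_path <- tempfile(fileext = ".csv")
write_csv(toy, toy_path)

# Read the CSV back in and assign it to an object
toy_pop <- read_csv(toy_path, show_col_types = FALSE)

# Keep only the population columns, dropping County and note
toy_pop_simplified <- select(toy_pop, pop_2010, pop_2020)

# Count rows and columns with commands rather than typing the numbers
n_rows <- nrow(toy_pop_simplified)
n_cols <- ncol(toy_pop_simplified)

# In the plain text of an R Markdown file, the same commands can be used in-line:
# The table has `r nrow(toy_pop_simplified)` rows and `r ncol(toy_pop_simplified)` columns.
```

When the document is knit, each in-line expression is replaced by its computed value, so the sentence stays correct even if the data change.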
Create a Plot
- Under the code chunk you last created, insert a third-level heading to your R Markdown file named “Plot”.
- Write a command to create a histogram of all county values in the `pop_2050` column.
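One possible way to draw a histogram is with ggplot2, which is loaded as part of the tidyverse. The sketch below uses a hypothetical table (`toy_counts`) with invented values, and the bin count and axis labels are arbitrary choices for illustration. Base R's `hist()` is another option.

```r
# Load the tidyverse, which includes ggplot2
library(tidyverse)

# Hypothetical toy values standing in for county populations; all values are invented
toy_counts <- tibble(pop_2050 = c(12, 15, 18, 22, 25, 31, 40, 55, 80, 150))

# Build a histogram of the pop_2050 column with five bins
toy_hist <- ggplot(toy_counts, aes(x = pop_2050)) +
  geom_histogram(bins = 5) +
  labs(x = "Projected population (toy values)", y = "Number of counties")

# Draw the plot
print(toy_hist)
```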
Calculate State-Level Values
After examining the dataset, you have probably noticed that each row represents a county in North Carolina. It is often useful to examine the “study region” as a whole (in this case, the full state). We will now aggregate the county-level data to the state-level.
- Under the code chunk you last created, insert a third-level heading to your R Markdown file named “Aggregating data”.
- Write a command that creates a new object called `state_summary` that gets the sum of each column except `County`.
- Add a new column to the `state_summary` object called `overall_change`. Use the `mutate()` command to fill this column with the statewide population change from 2010 to 2050.
- In plain text below the code chunk, explain why we would get an error if we tried to take the sum of the `County` variable. In that same plain text, add in-line code that prints the statewide population change from 2010 to 2050.
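The aggregation steps above can be sketched on a small made-up table. This is one possible approach using `summarise()` with `across()`, not the only one. The object names (`toy`, `toy_summary`, `county_sum_error`) and all values are hypothetical.

```r
# Load the tidyverse, which provides tibble(), summarise(), across(), and mutate()
library(tidyverse)

# Hypothetical toy data with one row per county; all values are invented
toy <- tibble(
  County   = c("A", "B", "C"),
  pop_2010 = c(100, 200, 300),
  pop_2050 = c(150, 260, 390)
)

# Sum every column except County, producing a one-row summary
toy_summary <- summarise(toy, across(-County, sum))

# Add a column holding the change between the first and last year
toy_summary <- mutate(toy_summary, overall_change = pop_2050 - pop_2010)

# Try to sum the County column and capture the error message instead of stopping
county_sum_error <- tryCatch(
  sum(toy$County),
  error = function(e) conditionMessage(e)
)

# In the plain text of an R Markdown file, the change can be printed in-line:
# The total change is `r toy_summary$overall_change`.
```

Looking at the type of the `County` column (for example with `class(toy$County)`) is a good starting point for explaining the captured error in your own words.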
Deliverables
Upload your .Rmd and .html files to Canvas.