1 Introduction

This policy brief makes an attempt to answer the question of “Do households value transit-rich neighborhoods compared to others?” by examining Chicago’s “L” rail transit system operated by CTA (Chicago Transit Authority). It is the second oldest rapid rail transit system in America after New York city’s elevated lines running through the city of Chicago and its surrounding suburbs in the Cook County, Illinois. The analysis looks into the census data for years 2011 and 2018 along with the CTA transit stops data to come up with indicators that help in answering the question posed, and finally provides policy-relevant conclusions and recommendations.

2 Setup

This section loads libraries required to do geo-processing in R. In addition, it also defines functions for styling plots and calculating quantiles.

######### Importing the required libraries #########
library(tidyverse)
library(tidycensus)
library(sf)
library(kableExtra)
library(RSocrata)

options(scipen=999)
options(tigris_class = "sf")

######### Functions for styling plots #########
mapTheme <- function(base_size = 12) {
  theme(
    text = element_text( color = "black"),
    plot.title = element_text(size = 16,colour = "black"),
    plot.subtitle=element_text(face="italic"),
    plot.caption=element_text(hjust=0),
    axis.ticks = element_blank(),
    panel.background = element_blank(),axis.title = element_blank(),
    axis.text = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    panel.grid.minor = element_blank(),
    panel.border = element_rect(colour = "black", fill=NA, size=2),
    strip.text.x = element_text(size = 14))
}

plotTheme <- function(base_size = 12) {
  theme(
    text = element_text( color = "black"),
    plot.title = element_text(size = 16,colour = "black"),
    plot.subtitle = element_text(face="italic"),
    plot.caption = element_text(hjust=0),
    axis.ticks = element_blank(),
    panel.background = element_blank(),
    panel.grid.major = element_line("grey80", size = 0.1),
    panel.grid.minor = element_blank(),
    panel.border = element_rect(colour = "black", fill=NA, size=2),
    strip.background = element_rect(fill = "grey80", color = "white"),
    strip.text = element_text(size=12),
    axis.title = element_text(size=12),
    axis.text = element_text(size=10),
    plot.background = element_blank(),
    legend.background = element_blank(),
    legend.title = element_text(colour = "black", face = "italic"),
    legend.text = element_text(colour = "black", face = "italic"),
    strip.text.x = element_text(size = 14)
  )
}

palette5 <- c("#f0f9e8","#bae4bc","#7bccc4","#43a2ca","#0868ac")

######### Helper Functions #########
qBr <- function(df, variable, rnd) {
  if (missing(rnd)) {
    as.character(quantile(round(df[[variable]],0),
                          c(.01,.2,.4,.6,.8), na.rm=T))
  } else if (rnd == FALSE | rnd == F) {
    as.character(formatC(quantile(df[[variable]]), digits = 3),
                 c(.01,.2,.4,.6,.8), na.rm=T)
  }
}

q5 <- function(variable) {as.factor(ntile(variable, 5))}

3 Data Wrangling

This section describes the steps taken to load and pre-process the tracts and CTA transit stops data:

  • Census Tracts

    • Load data for years 2011 and 2018 for Cook Country, IL
    • Keep tracts that have population greater than 0.
    • Combine the two years of data.
  • CTA stops

    • Load the CTA stops data using the API
    • Remove duplicate stops occurring due to different bounds in the same station.
  • Identifying tracts as belonging to TOD/Non-TOD region

  • Accounting for inflation in rent and household income across the two years.

######### Loading and Cleaning Census Tracts Data #########

# CRS used for transformation
crs <- st_crs("+proj=lcc +lat_1=40.88333333333333 +lat_2=41.95 +lat_0=40.16666666666666 +lon_0=-77.75 +x_0=600000.0000000001 +y_0=0 +ellps=GRS80 +datum=NAD83 +to_meter=0.3048006096012192 no_defs")

# Loading and cleaning tracts data for year 2011
tracts_2011 <-  
  get_acs(geography = "tract", variables = c("B25026_001E","B02001_002E","B15001_050E",
                                             "B15001_009E","B19013_001E","B25058_001E",
                                             "B06012_002E"), 
          year=2011, state='IL', county='Cook County', geometry=T, output="wide") %>% 
  st_transform(crs=crs) %>% 
  rename(TotalPop = B25026_001E, 
         Whites = B02001_002E,
         FemaleBachelors = B15001_050E, 
         MaleBachelors = B15001_009E,
         MedHHInc = B19013_001E, 
         MedRent = B25058_001E,
         TotalPoverty = B06012_002E) %>%
  dplyr::select(-NAME, -starts_with("B")) %>%
  filter(TotalPop > 0) %>% 
  mutate(pctWhite = ifelse(TotalPop > 0, ifelse(Whites / TotalPop > 1, 1, Whites / TotalPop),0),
         pctBachelors = ifelse(TotalPop > 0, ((FemaleBachelors + MaleBachelors) / TotalPop),0),
         pctPoverty = ifelse(TotalPop > 0, TotalPoverty / TotalPop, 0),
         year = "2011") %>%
  dplyr::select(-Whites, -FemaleBachelors, -MaleBachelors, -TotalPoverty) 

# Loading and cleaning tracts data for 2018
tracts_2018 <-  
  get_acs(geography = "tract", variables = c("B25026_001E","B02001_002E","B15001_050E",
                                             "B15001_009E","B19013_001E","B25058_001E",
                                             "B06012_002E"), 
          year=2018, state='IL', county='Cook County', geometry=T, output="wide") %>% 
  st_transform(crs=crs) %>% 
  rename(TotalPop = B25026_001E, 
         Whites = B02001_002E,
         FemaleBachelors = B15001_050E, 
         MaleBachelors = B15001_009E,
         MedHHInc = B19013_001E, 
         MedRent = B25058_001E,
         TotalPoverty = B06012_002E) %>%
  dplyr::select(-NAME, -starts_with("B")) %>%
  filter(TotalPop > 0) %>% 
  mutate(pctWhite = ifelse(TotalPop > 0, ifelse(Whites / TotalPop > 1, 1, Whites / TotalPop),0),
         pctBachelors = ifelse(TotalPop > 0, ((FemaleBachelors + MaleBachelors) / TotalPop),0),
         pctPoverty = ifelse(TotalPop > 0, TotalPoverty / TotalPop, 0),
         year = "2018") %>%
  dplyr::select(-Whites, -FemaleBachelors, -MaleBachelors, -TotalPoverty) 

# Combining tracts data
allTracts <- rbind(tracts_2011, tracts_2018)

######### Loading and Cleaning Transit Stop Data #########

cta_stops <- st_read("https://data.cityofchicago.org/resource/8pix-ypme.geojson") %>% 
  select(stop_name, station_descriptive_name) %>% 
  group_by(station_descriptive_name) %>% 
  slice(1) %>% 
  st_transform(st_crs(tracts_2011))

######### TOD/Non-TOD Tracts #########

# Creating a 0.5 mile buffer around transit stops
buffer <- st_union(st_buffer(cta_stops, 2640)) %>%
  st_sf() 

# Classifying tracts as TOD/Non-TOD and accounting for inflation 
allTracts.group <- 
  rbind(
    st_centroid(allTracts)[buffer,] %>%
      st_drop_geometry() %>%
      left_join(allTracts) %>%
      st_sf() %>%
      mutate(TOD = "TOD"),
    st_centroid(allTracts)[buffer, op = st_disjoint] %>%
      st_drop_geometry() %>%
      left_join(allTracts) %>%
      st_sf() %>%
      mutate(TOD = "Non-TOD")) %>%
  mutate(MedRent.inf = ifelse(year == "2011", MedRent * 1.13, MedRent),
         MedHHInc.inf = ifelse(year == "2011", MedHHInc * 1.13, MedHHInc))

4 Comparing Census Variables Across Time and Space

This section examines how census variables, like median rent, median household income, percent poverty and percent bachelors vary in TOD and Non-TOD areas across the years 2011 and 2018.

4.1 Median Rent

# Median rent 
ggplot()+
  geom_sf(data=st_union(allTracts.group))+
  geom_sf(data=allTracts.group, aes(fill=q5(MedRent.inf)), colour=NA)+
  geom_sf(data=buffer, colour='red', fill=NA)+
  scale_fill_manual(values = palette5,
                    labels = qBr(allTracts.group, "MedRent.inf"),
                    name = "Median Rent\n(Quantile Breaks)")+
  facet_wrap(~year)+
  labs(title = "Median rent, 2011-2018",
       subtitle="Cook County, IL",
       caption = "Figure 1: Median Rent by census tract for years 2011 and 2018. The red borders denote area close to the CTA stations") +
  mapTheme() + theme(plot.title = element_text(size=22))

From Figure 1 it an can be seen that for the TOD regions, the median rent has increased over the two years, but this increase is only significant around the Chicago Loop and North/Northwest side of the Loop. Proximity to transportation hub can be one of the reasons for this increase in the rent for this region, in addition to others like proximity to corporate offices and city attractions in the downtown and business districts. On the other hand there does not seem to be any significant difference in the median rent for Non-TOD areas.

4.2 Household Income

# Household Income
ggplot()+
  geom_sf(data=st_union(allTracts.group))+
  geom_sf(data=allTracts.group, aes(fill=q5(MedHHInc.inf)), colour=NA)+
  geom_sf(data=buffer, colour='red', fill=NA)+
  scale_fill_manual(values = palette5,
                    labels = qBr(allTracts.group, "MedHHInc.inf"),
                    name = "Median Household Income\n(Quantile Breaks)")+
  facet_wrap(~year)+
  labs(title = "Median Household Income, 2011-2018",
       subtitle="Cook County, IL",
       caption = "Figure 2: Median Household Income by census tract for years 2011 and 2018. The red borders denote area close to the CTA stations") +
  mapTheme() + theme(plot.title = element_text(size=22))

From Figure 2 it can be seen that for the median household income has increased for specific TOD regions such as the Chicago Loop and North/Northwest side of the Loop. This is similar to what was also seen in Figure 1. One can argue that the TOD development has led to the displacement of lower household incomes due to the spurring increases in the land and home prices and this has led to them being replaced by higher household incomes, also commonly known as gentrification.

4.3 Poverty

# Poverty
ggplot()+
  geom_sf(data=st_union(allTracts.group))+
  geom_sf(data=allTracts.group, aes(fill=q5(pctPoverty)), colour=NA)+
  geom_sf(data=buffer, colour='red', fill=NA)+
  scale_fill_manual(values = palette5,
                    labels = qBr(allTracts.group, "pctPoverty", F),
                    name = "Pct Poverty\n(Quantile Breaks)")+
  facet_wrap(~year)+
  labs(title = "Percent Poverty, 2011-2018",
       subtitle="Cook County, IL",
       caption = "Figure 3: Percent Poverty by census tract for years 2011 and 2018. The red borders denote area close to the CTA stations") +
  mapTheme() + theme(plot.title = element_text(size=22))

From Figure 3 it can be seen that there has been no significant change in the percent poverty around the TOD regions across the two years, except for the Northwest side of the Chicago Loop where we see a slight decrease. It is also interesting to see that that over the years, the percent poverty in the Southern part of Chicago has been significantly higher compared the the Northern part, and this could be one of the reasons why there has been minimal TOD around South Chicago.

4.4 Percent Bachelors

# Percent Bachelors
ggplot()+
  geom_sf(data=st_union(allTracts.group))+
  geom_sf(data=allTracts.group, aes(fill=q5(pctBachelors)), colour=NA)+
  geom_sf(data=buffer, colour='red', fill=NA)+
  scale_fill_manual(values = palette5,
                    labels = qBr(allTracts.group, "pctBachelors", F),
                    name = "Pct Bachelors\n(Quantile Breaks)")+
  facet_wrap(~year)+
  labs(title = "Percent Bachelors, 2011-2018",
       subtitle="Cook County, IL",
       caption = "Figure 4: Percent Bachelors by census tract for years 2011 and 2018. The red borders denote area close to the CTA stations") +
  mapTheme() + theme(plot.title = element_text(size=22))

From Figure 4 it can be seen that the percentage of bachelors in the TOD regions have not changed significant over the two years. Also, it should be noted that the TOD regions in the north of the Chicago Loop have higher percent bachelors compared to other TOD regions. One can attribute this to the fact that students and young professionals tend to live and work around the Northern side of the Loop.

5 Grouped Bar Plot

This section displays a grouped bar plot to examine changes in Census variables such as Median Rent, Percent Poverty, Percent Bachelors and Household income in TOD/Non-TOD regions across the year 2011 to 2018.

########## Comparing census variables using grouped bar plot ##########

allTracts.Summary <- 
  st_drop_geometry(allTracts.group) %>%
  group_by(year, TOD) %>%
  summarize(Rent = mean(MedRent, na.rm = T),
            `Household Income` = mean(MedHHInc, na.rm = T),
            `Pct Bachelors` = mean(pctBachelors, na.rm = T),
            `Pct Poverty` = mean(pctPoverty, na.rm = T))

allTracts.Summary %>%
  gather(Variable, Value, -year, -TOD) %>%
  ggplot(aes(year, Value, fill = TOD)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_wrap(~Variable, scales = "free", ncol=3) +
  scale_fill_manual(values = c("#bae4bc", "#0868ac")) +
  labs(title = "Indicator differences across time and space",
       caption = "Figure 5: Changes in indicator variables by TOD and non-TOD census tracts for 2011 and 2018") +
  plotTheme() + 
  theme(legend.position="bottom")

6 Tabular Comparison

########## Comparing census variables using table ##########
allTracts.Summary %>%
  unite(year.TOD, year, TOD, sep = ": ", remove = T) %>%
  gather(Variable, Value, -year.TOD) %>%
  mutate(Value = round(Value, 2)) %>%
  spread(year.TOD, Value) %>%
  kable() %>%
  kable_styling() %>% 
  footnote(general_title = "\n",
           general = "Table 1: Changes in indicator variables by TOD and non-TOD census tracts for 2011 and 2018")
Variable 2011: Non-TOD 2011: TOD 2018: Non-TOD 2018: TOD
Household Income 57828.40 52800.74 65018.65 64547.99
Pct Bachelors 0.01 0.03 0.01 0.03
Pct Poverty 0.16 0.23 0.16 0.21
Rent 877.96 866.83 1011.19 1069.35

Table 1: Changes in indicator variables by TOD and non-TOD census tracts for 2011 and 2018

From Figure 5 and Table 1, it can be seen that the household income and rent in both TOD and Non-TOD regions have increased over the years and the rate of increase for these two indicators in TOD regions has been higher compared to Non-TOD regions. On the other hand, the Percent Poverty has remained stagnant in non-TOD regions while it has slightly decreased in the TOD regions. There has been no significant improvement in Percent Bachelors in both TOD and Non-TOD region across the two years.

7 Graduated Symbols Maps

This section deals with creating graduated symbol maps of population and rent within 0.5 mile of each transit station.

7.1 Population

stops_in_tracts <- st_join(st_buffer(cta_stops, 2640), st_centroid(allTracts.group)) %>% 
  st_drop_geometry() %>% 
  group_by(station_descriptive_name, year) %>% 
  summarise(Rent = mean(MedRent.inf, na.rm=T),
            Population = mean(TotalPop, na.rm = T)) %>% 
  ungroup() %>% 
  left_join(cta_stops, by=c("station_descriptive_name" = "station_descriptive_name")) %>% 
  st_sf() %>% 
  st_transform(crs=4326) 

stops_coordinates <- stops_in_tracts %>% 
  st_coordinates() %>% 
  as.data.frame() %>% 
  rename(c("lat"="X", "long"="Y"))

stops_in_tracts <- cbind(stops_in_tracts, stops_coordinates) %>% drop_na()

# Population
ggplot() + 
  geom_sf(data = st_union(allTracts.group) %>% st_transform(crs = 4326), fill = '#f0f0f0', color = 'white') + 
  geom_point(data = stops_in_tracts, aes(x=lat, y=long, size = q5(Population)), color="red", alpha=0.5) + 
  facet_wrap(~year) + 
  scale_size_manual(values = c(1,2,3,4,5), labels = qBr(stops_in_tracts, "Population"))+
  guides(size=guide_legend("Total Population\n(Quantile Breaks)")) + 
  labs(title = "Population within 0.5 miles of each transit station, 2011-2018",
       caption = "Figure 6") + 
  mapTheme() +
  theme(plot.title = element_text(size = 30, face = "bold"),
    legend.title=element_text(size=20), 
    legend.text=element_text(size=20),
    plot.caption = element_text(size=20))

From Figure 6 we can see that there has been an increase in population in and around the transit stops close to the Chicago Loop and not much difference is observed for other transit stations.

7.2 Rent

# Rent 
ggplot() + 
  geom_sf(data = st_union(allTracts.group) %>% st_transform(crs = 4326), fill = '#f0f0f0', color = 'white') + 
  geom_point(data = stops_in_tracts, aes(x=lat, y=long, size = q5(Rent)), color="red", alpha=0.6) + 
  facet_wrap(~year) + 
  scale_size_manual(values = c(1,2,3,4,5), labels = qBr(stops_in_tracts, "Rent"))+
  guides(size=guide_legend("Median Rent\n(Quantile Breaks)")) + 
  labs(title = "Rent within 0.5 miles of each transit station",
       caption = "Figure 7") + 
  mapTheme() + 
  theme(plot.title = element_text(size = 30, face = "bold"),
    legend.title=element_text(size=20), 
    legend.text=element_text(size=20),
    plot.caption = element_text(size=20))

From Figure 7 it seems like the rent has increased around some transit stations near the Chicago Loop, although it is hard to make strong inferences due to the cluttering of the points in this region. For the other transit stops, we do not see any significant change across years.

8 Rent as a function of Distance

multipleRingBuffer <- function(inputPolygon, maxDistance, interval) 
{
  #create a list of distances that we'll iterate through to create each ring
  distances <- seq(0, maxDistance, interval)
  #we'll start with the second value in that list - the first is '0'
  distancesCounter <- 2
  #total number of rings we're going to create
  numberOfRings <- floor(maxDistance / interval)
  #a counter of number of rings
  numberOfRingsCounter <- 1
  #initialize an otuput data frame (that is not an sf)
  allRings <- data.frame()
  
  #while number of rings  counteris less than the specified nubmer of rings
  while (numberOfRingsCounter <= numberOfRings) 
  {
    #if we're interested in a negative buffer and this is the first buffer
    #(ie. not distance = '0' in the distances list)
    if(distances[distancesCounter] < 0 & distancesCounter == 2)
    {
      #buffer the input by the first distance
      buffer1 <- st_buffer(inputPolygon, distances[distancesCounter])
      #different that buffer from the input polygon to get the first ring
      buffer1_ <- st_difference(inputPolygon, buffer1)
      #cast this sf as a polygon geometry type
      thisRing <- st_cast(buffer1_, "POLYGON")
      #take the last column which is 'geometry'
      thisRing <- as.data.frame(thisRing[,ncol(thisRing)])
      #add a new field, 'distance' so we know how far the distance is for a give ring
      thisRing$distance <- distances[distancesCounter]
    }
    
    
    #otherwise, if this is the second or more ring (and a negative buffer)
    else if(distances[distancesCounter] < 0 & distancesCounter > 2) 
    {
      #buffer by a specific distance
      buffer1 <- st_buffer(inputPolygon, distances[distancesCounter])
      #create the next smallest buffer
      buffer2 <- st_buffer(inputPolygon, distances[distancesCounter-1])
      #This can then be used to difference out a buffer running from 660 to 1320
      #This works because differencing 1320ft by 660ft = a buffer between 660 & 1320.
      #bc the area after 660ft in buffer2 = NA.
      thisRing <- st_difference(buffer2,buffer1)
      #cast as apolygon
      thisRing <- st_cast(thisRing, "POLYGON")
      #get the last field
      thisRing <- as.data.frame(thisRing$geometry)
      #create the distance field
      thisRing$distance <- distances[distancesCounter]
    }
    
    #Otherwise, if its a positive buffer
    else 
    {
      #Create a positive buffer
      buffer1 <- st_buffer(inputPolygon, distances[distancesCounter])
      #create a positive buffer that is one distance smaller. So if its the first buffer
      #distance, buffer1_ will = 0. 
      buffer1_ <- st_buffer(inputPolygon, distances[distancesCounter-1])
      #difference the two buffers
      thisRing <- st_difference(buffer1,buffer1_)
      #cast as a polygon
      thisRing <- st_cast(thisRing, "POLYGON")
      #geometry column as a data frame
      thisRing <- as.data.frame(thisRing[,ncol(thisRing)])
      #add teh distance
      thisRing$distance <- distances[distancesCounter]
    }  
    
    #rbind this ring to the rest of the rings
    allRings <- rbind(allRings, thisRing)
    #iterate the distance counter
    distancesCounter <- distancesCounter + 1
    #iterate the number of rings counter
    numberOfRingsCounter <- numberOfRingsCounter + 1
  }
  
  #convert the allRings data frame to an sf data frame
  allRings <- st_as_sf(allRings)
}

# Finding which tracts belong to which buffer
tracts.distance <- st_join(
  st_centroid(dplyr::select(allTracts.group, GEOID, year)),
  multipleRingBuffer(buffer, 52800, 2640)) %>% 
  st_drop_geometry() %>%
  left_join(dplyr::select(allTracts.group, GEOID, MedRent, year), 
            by=c("GEOID"="GEOID", "year"="year")) %>%
  st_sf() %>% 
  drop_na() %>% 
  mutate(distance = distance / 5280) 

# Line plot
tracts.distance %>% 
  st_drop_geometry() %>% 
  group_by(year, distance) %>% 
  summarise(
    mean_rent = mean(MedRent)
  ) %>% 
  ggplot() +
  geom_line(aes(x=distance, y = mean_rent, color=year), size=1.5) + 
  geom_point(aes(x=distance, y = mean_rent, color=year), size=2) + 
  scale_color_manual(values = c( '#BAE4BC', '#0868AC')) + 
  labs(
    title = "Rent as a function of distance to station",
    caption = "Figure 8: Average median rent for each half mile from CTA station",
    x = "Miles",
    y = "Average Rent",
    color = "Year"
  )+ 
  plotTheme()

From Figure 8 we can see that for the first 6 miles, there is a lot of variation in the rent with no significant higher rent for the regions that are closer to the transit stations. This can be attributed to the fact that the ring buffer extends in both the Northern and the Southern part of Chicago where we have seen differential effects on rent due to TOD (Figure 1). After the 6 miles we see a steady increase in the rent values.

9 Relation between crime, transit and rent

# Look at package to load entire data

crimes.2011 <- read.socrata("https://data.cityofchicago.org/resource/qnrb-dui6.json") %>% 
  filter(primary_type == "ROBBERY" & location.latitude > 37) %>% 
  na.omit() %>%
  st_as_sf(coords = c("location.longitude", "location.latitude"), crs = 4326, agr = "constant") %>%
  select('case_number') %>% 
  st_transform(crs=st_crs(tracts_2011))

crimes.2018 <- read.socrata("https://data.cityofchicago.org/resource/3i3m-jwuy.json") %>% 
  filter(primary_type == "ROBBERY") %>% 
  na.omit() %>%
  st_as_sf(coords = c("location.longitude", "location.latitude"), crs = 4326, agr = "constant") %>% 
  select('case_number') %>% 
  st_transform(crs=st_crs(tracts_2011))

allTracts.group.filtered.2011 <- allTracts.group %>% 
  select('GEOID', 'MedRent.inf', 'TOD', 'year') %>% 
  filter(year == '2011')

allTracts.group.filtered.2018 <- allTracts.group %>% 
  select('GEOID', 'MedRent.inf', 'TOD', 'year') %>% 
  filter(year == '2018')

# crime 2011
tracts.crime.2011 <- allTracts.group.filtered.2011 %>% 
  st_intersection(crimes.2011) %>% 
  group_by(GEOID) %>% 
  summarise(
    crime_count = n_distinct(case_number)
  ) %>% 
  st_drop_geometry() %>% 
  right_join(allTracts.group.filtered.2011) %>% 
  st_sf()

# crimes 2018
tracts.crime.2018 <- allTracts.group.filtered.2018 %>% 
  st_intersection(crimes.2018) %>% 
  group_by(GEOID) %>% 
  summarise(
    crime_count = n_distinct(case_number)
  ) %>% 
  st_drop_geometry() %>% 
  right_join(allTracts.group.filtered.2018) %>% 
  st_sf()

# combine the two
tracts.crime <- rbind(tracts.crime.2011, tracts.crime.2018) 

# Crime across years

ggplot() +
  geom_sf(data = st_union(tracts.crime))+
  geom_sf(data = tracts.crime, aes(fill = q5(crime_count)), color=NA) +
  geom_sf(data = buffer, colour='red', fill=NA) +
  facet_wrap(~year)+
  scale_fill_manual(values = palette5,
                    labels = qBr(tracts.crime, "crime_count"),
                    name = "Burglary Count\n(Quantile Breaks)"
                    )+
  labs(title = "Total Burglaries, 2011-2018", 
       subtitle = "Chicago City, IL",
       caption = "Figure 9: The red borders denote area close to the transit stations")+
  mapTheme()

# Summary statistics

allTracts.Summary <- 
  st_drop_geometry(allTracts.group) %>%
  group_by(year, TOD) %>%
  summarize(Rent = mean(MedRent, na.rm = T),
            Population = mean(TotalPop, na.rm = T),
            Percent_White = mean(pctWhite, na.rm = T),
            Percent_Bach = mean(pctBachelors, na.rm = T),
            Percent_Poverty = mean(pctPoverty, na.rm = T))
  
crime.Summary <- tracts.crime %>% 
  st_drop_geometry() %>% 
  group_by(year, TOD) %>% 
  summarise(
    Rent = mean(MedRent.inf, na.rm = T),
    Burglary = mean(crime_count, na.rm = T) 
  )

# Grouped Barplot 
crime.Summary %>% 
  gather(Variable, Value, -year, -TOD) %>%
  ggplot(aes(year, Value, fill = TOD)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_wrap(~Variable, scales = "free", ncol=5) +
  scale_fill_manual(values = c("#bae4bc", "#0868ac")) +
  labs(title = "Indicator differences across time and space", 
       caption = "Figure 10") +
  plotTheme() + theme(legend.position="bottom")

The crime type picked for this analysis was Burglary (This analysis was conduced only in Chicago region and not the entire Cook County). From Figure 9 and 10 we see that the count of Burglary has decrease in both TOD and Non-TOD regions over the years which can be attributed to improved policing over the years. It is also observed that as the count of Burglary went down, the rent increased in the TOD regions and marginally in the Non-TOD regions over the years.

10 Conclusion

From the above analysis, it can be concluded that the residents in North TOD region, mainly covering the Chicago Loop and North/Northwest side of Chicago Loop, may be willing to pay more to live in transit rich neighborhoods. This conclusion can be reached due to the significant increase in Median Rent and Household Income and, decrease in Percent Poverty and crime in the above mentioned areas from 2011 to 2018. However this potential willingness to pay more might not be just due of proximity to transportation hub, but also due to the Downtown and Business District lying in that area, and hence residents of Chicago may be willing to pay more to access amenities such as corporate offices, city attractions and trendy restaurants.

On the other hand, the TOD region in South Chicago, did not see any significant improvements in rent, income or poverty. Hence it is quite possible that the residents may not be willing to pay more in South TOD region of Chicago. There are several factors that provide evidence for this conclusion, the first being the level of crime in South Chicago. Although crime reduced in South Chicago over the years, it is still much higher on average than North Chicago. Coupling this factor with multiple other factors such as high poverty and low household income, it is safe to say that this does not make the South TOD region an ideal place to live. The potential reason behind this could be the legacy of segregation in Chicago, loss of manufacturing and industrial jobs and lack of educational opportunities in South Chicago. All these reasons hinder the transit oriented development in South Chicago.

The following policy recommendations that can be made are:

  1. Provide Manufacturing and Industrial jobs in South Chicago especially around the South TOD Region which would in turn reduce poverty and increase household income.

  2. Increase educational resources in areas in and around South TOD so that businesses and corporate offices would want to establish in South TOD as well or basically break the virtual wall between north and south Chicago.

  3. Provide affordable housing to racial minorities especially African Americans so as to break the legacy of segregation which could be the main hindrance behind transit oriented development in South Chicago.

In addition to this, it is important to keep in mind that all the analysis presented in above has been conducted on census tracts that are arbitrarily drawn and likely to change over years (Modifiable Areal Unit Problem). This implies that the inference we make from our spatial plots or even the tract level aggregated values would come with some error. Also, the Ecological Fallacy has not been taken into account, since the underlying distribution of the census variables across the tract are unknown, which leads to questioning whether the mean of a variable is the best indicator.